Overview

Dataset statistics

Number of variables: 10
Number of observations: 1048575
Missing cells: 1510399
Missing cells (%): 14.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 80.0 MiB
Average record size in memory: 80.0 B
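
The derived figures in this block can be cross-checked from the raw counts. The following is a minimal sketch that uses only numbers reported in this document; the variable names are illustrative, and the per-variable missing counts are taken from the device_id and deleted_date alerts.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 10
n_observations = 1_048_575
missing_cells = 1_510_399
total_size_bytes = 80.0 * 1024**2  # 80.0 MiB expressed in bytes

# Missing counts reported for the two variables flagged as "Missing".
missing_device_id = 461_824
missing_deleted_date = 1_048_575

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
print(missing_device_id + missing_deleted_date == missing_cells)
```

The last line confirms that the two flagged variables account for every missing cell in the dataset.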

Variable types

Numeric: 5
DateTime: 1
Categorical: 3
Unsupported: 1

Alerts

deleted has constant value "0" (Constant)
source_id is highly correlated with source_type and 1 other field (High correlation)
source_type is highly correlated with source_id (High correlation)
value is highly correlated with device_id (High correlation)
device_id is highly correlated with source_id and 1 other field (High correlation)
source_id is highly correlated with source_type and 1 other field (High correlation)
source_type is highly correlated with source_id (High correlation)
device_id is highly correlated with source_id (High correlation)
source_id is highly correlated with source_type and 1 other field (High correlation)
source_type is highly correlated with source_id (High correlation)
device_id is highly correlated with source_id (High correlation)
deleted is highly correlated with source_type and 1 other field (High correlation)
source_type is highly correlated with deleted (High correlation)
type is highly correlated with deleted (High correlation)
type is highly correlated with source_id and 4 other fields (High correlation)
source_id is highly correlated with type and 3 other fields (High correlation)
source_type is highly correlated with type and 1 other field (High correlation)
value is highly correlated with type and 1 other field (High correlation)
device_id is highly correlated with type and 2 other fields (High correlation)
zone_id is highly correlated with type and 3 other fields (High correlation)
device_id has 461824 (44.0%) missing values (Missing)
deleted_date has 1048575 (100.0%) missing values (Missing)
id is uniformly distributed (Uniform)
id has unique values (Unique)
deleted_date is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
source_id has 24939 (2.4%) zeros (Zeros)
zone_id has 61451 (5.9%) zeros (Zeros)

Reproduction

Analysis started: 2022-08-19 09:25:35.274419
Analysis finished: 2022-08-19 09:26:18.386490
Duration: 43.11 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
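
The reported duration follows from the two timestamps. A small sketch with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2022-08-19 09:25:35.274419")
finished = datetime.fromisoformat("2022-08-19 09:26:18.386490")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```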

Variables

id
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 1048575
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 524866
Minimum: 579
Maximum: 1049153
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 579
5-th percentile: 53007.7
Q1: 262722.5
Median: 524866
Q3: 787009.5
95-th percentile: 996724.3
Maximum: 1049153
Range: 1048574
Interquartile range (IQR): 524287

Descriptive statistics

Standard deviation: 302697.6736
Coefficient of variation (CV): 0.5767141968
Kurtosis: -1.2
Mean: 524866
Median Absolute Deviation (MAD): 262144
Skewness: 0
Sum: 5.50361366 × 10^11
Variance: 9.16258816 × 10^10
Monotonicity: Strictly increasing
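
Because the range plus one equals the number of distinct values, the ids form a run of consecutive integers, so these statistics have closed forms. The sketch below assumes the report uses the sample variance (divisor n - 1), for which n consecutive integers give n(n + 1)/12; that assumption reproduces the reported variance.

```python
import math

# Figures copied from the id statistics in this report.
minimum, maximum, n = 579, 1_049_153, 1_048_575

# Consecutive integers: the span of values equals the number of rows.
is_consecutive = (maximum - minimum + 1 == n)

mean = (minimum + maximum) / 2
variance = n * (n + 1) / 12          # sample variance of n consecutive integers
std = math.sqrt(variance)
cv = std / mean                      # coefficient of variation
total = n * (minimum + maximum) // 2 # arithmetic-series sum

print(is_consecutive, mean, variance)
print(f"std={std:.4f} cv={cv:.10f} sum={total:.8e}")
```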
Histogram with fixed size bins (bins=50)
Value                   Count    Frequency (%)
579                     1        < 0.1%
699634                  1        < 0.1%
699621                  1        < 0.1%
699622                  1        < 0.1%
699623                  1        < 0.1%
699624                  1        < 0.1%
699625                  1        < 0.1%
699626                  1        < 0.1%
699627                  1        < 0.1%
699628                  1        < 0.1%
Other values (1048565)  1048565  > 99.9%

Value    Count  Frequency (%)
579      1      < 0.1%
580      1      < 0.1%
581      1      < 0.1%
582      1      < 0.1%
583      1      < 0.1%
584      1      < 0.1%
585      1      < 0.1%
586      1      < 0.1%
587      1      < 0.1%
588      1      < 0.1%

Value    Count  Frequency (%)
1049153  1      < 0.1%
1049152  1      < 0.1%
1049151  1      < 0.1%
1049150  1      < 0.1%
1049149  1      < 0.1%
1049148  1      < 0.1%
1049147  1      < 0.1%
1049146  1      < 0.1%
1049145  1      < 0.1%
1049144  1      < 0.1%

Distinct: 101890
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Memory size: 8.0 MiB
Minimum: 2016-08-02 12:32:18
Maximum: 2016-08-03 23:58:22
Histogram with fixed size bins (bins=50)

type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.0 MiB

Value  Count
2      419535
11     224480
8      211030
4      193530

Length

Max length: 2
Median length: 1
Mean length: 1.214081015
Min length: 1

Characters and Unicode

Total characters: 1273055
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
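
The length and character totals for type follow directly from its four category counts. A minimal sketch, treating each category label as a string; the dictionary and variable names are illustrative:

```python
# Category label -> count for `type`, copied from this report.
counts = {"2": 419_535, "11": 224_480, "8": 211_030, "4": 193_530}

n = sum(counts.values())
total_chars = sum(len(label) * c for label, c in counts.items())
mean_length = total_chars / n

# Tally individual characters: the digit "1" occurs twice in every "11".
char_counts = {}
for label, c in counts.items():
    for ch in label:
        char_counts[ch] = char_counts.get(ch, 0) + c

print(n, total_chars, round(mean_length, 9))
print(char_counts)
```

The same tally explains why the character "1" (448960 occurrences) outranks "2" in the character tables even though the category "2" is the most frequent value.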

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 8
3rd row: 8
4th row: 2
5th row: 8

Common Values

Value  Count   Frequency (%)
2      419535  40.0%
11     224480  21.4%
8      211030  20.1%
4      193530  18.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count   Frequency (%)
2      419535  40.0%
11     224480  21.4%
8      211030  20.1%
4      193530  18.5%

Most occurring characters

Value  Count   Frequency (%)
1      448960  35.3%
2      419535  33.0%
8      211030  16.6%
4      193530  15.2%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  1273055  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      448960  35.3%
2      419535  33.0%
8      211030  16.6%
4      193530  15.2%

Most occurring scripts

Value   Count    Frequency (%)
Common  1273055  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
1      448960  35.3%
2      419535  33.0%
8      211030  16.6%
4      193530  15.2%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1273055  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
1      448960  35.3%
2      419535  33.0%
8      211030  16.6%
4      193530  15.2%

source_id
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 44
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.40758172
Minimum: 0
Maximum: 58
Zeros: 24939
Zeros (%): 2.4%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 5
Q3: 35
95-th percentile: 51
Maximum: 58
Range: 58
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 18.07786195
Coefficient of variation (CV): 1.038505075
Kurtosis: -0.9170016318
Mean: 17.40758172
Median Absolute Deviation (MAD): 4
Skewness: 0.7817356947
Sum: 18253155
Variance: 326.8090928
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=44)
Value              Count   Frequency (%)
3                  265848  25.4%
2                  93832   8.9%
5                  77736   7.4%
1                  47026   4.5%
4                  39408   3.8%
0                  24939   2.4%
35                 24264   2.3%
36                 24128   2.3%
18                 23638   2.3%
17                 23638   2.3%
Other values (34)  404118  38.5%
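
The "Other values" row is the remainder once the ten most frequent values are removed. A short sketch that recomputes it from the counts listed for source_id; the names are illustrative:

```python
# Ten most frequent source_id values and their counts, copied from this report.
top10 = {3: 265_848, 2: 93_832, 5: 77_736, 1: 47_026, 4: 39_408,
         0: 24_939, 35: 24_264, 36: 24_128, 18: 23_638, 17: 23_638}
n_observations = 1_048_575
n_distinct = 44

other_count = n_observations - sum(top10.values())
other_distinct = n_distinct - len(top10)
other_pct = 100 * other_count / n_observations
zeros_pct = 100 * top10[0] / n_observations

print(other_distinct, other_count, f"{other_pct:.1f}%", f"{zeros_pct:.1f}%")
```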

Value  Count   Frequency (%)
0      24939   2.4%
1      47026   4.5%
2      93832   8.9%
3      265848  25.4%
4      39408   3.8%
5      77736   7.4%
7      23620   2.3%
9      5328    0.5%
10     5327    0.5%
11     2347    0.2%

Value  Count   Frequency (%)
58     6130    0.6%
57     11568   1.1%
54     11568   1.1%
53     3085    0.3%
52     12345   1.2%
51     13816   1.3%
50     13852   1.3%
49     13849   1.3%
48     13852   1.3%
47     13852   1.3%

source_type
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.0 MiB

Value  Count
0      586751
1      461824

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1048575
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      586751  56.0%
1      461824  44.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count   Frequency (%)
0      586751  56.0%
1      461824  44.0%

Most occurring characters

Value  Count   Frequency (%)
0      586751  56.0%
1      461824  44.0%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  1048575  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      586751  56.0%
1      461824  44.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  1048575  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      586751  56.0%
1      461824  44.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1048575  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      586751  56.0%
1      461824  44.0%

value
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 8848
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 187.4722094
Minimum: 0.1
Maximum: 2044
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 0.1
5-th percentile: 0.14
Q1: 14.2
Median: 73.49
Q3: 179
95-th percentile: 864
Maximum: 2044
Range: 2043.9
Interquartile range (IQR): 164.8

Descriptive statistics

Standard deviation: 287.0248363
Coefficient of variation (CV): 1.53102605
Kurtosis: 4.507448884
Mean: 187.4722094
Median Absolute Deviation (MAD): 69.49
Skewness: 2.174587786
Sum: 196578672
Variance: 82383.25666
Monotonicity: Not monotonic
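
Several of the figures reported for value are derived from others in the same tables. The sketch below recomputes the range, IQR, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation; it is a consistency check on the published numbers, not a recomputation from the raw data.

```python
# Figures copied from the value statistics in this report.
minimum, maximum = 0.1, 2044
q1, q3 = 14.2, 179
mean, std = 187.4722094, 287.0248363

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation

print(f"range={value_range:.1f} iqr={iqr:.1f} cv={cv:.6f} variance={variance:.3f}")
```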
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
0.13                 22393   2.1%
82.74                22259   2.1%
84.3                 20785   2.0%
81.45                15820   1.5%
0.24                 15053   1.4%
2                    14647   1.4%
0.12                 14394   1.4%
1.5                  13237   1.3%
80.16                12303   1.2%
0.14                 10053   1.0%
Other values (8838)  887631  84.7%

Value  Count  Frequency (%)
0.1    7115   0.7%
0.11   524    < 0.1%
0.12   14394  1.4%
0.13   22393  2.1%
0.14   10053  1.0%
0.15   4445   0.4%
0.16   1713   0.2%
0.17   1875   0.2%
0.18   5835   0.6%
0.19   7131   0.7%

Value  Count  Frequency (%)
2044   3      < 0.1%
1984   3      < 0.1%
1974   3      < 0.1%
1934   3      < 0.1%
1924   3      < 0.1%
1915   3      < 0.1%
1901   3      < 0.1%
1886   3      < 0.1%
1870   3      < 0.1%
1868   3      < 0.1%

device_id
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 10
Distinct (%): < 0.1%
Missing: 461824
Missing (%): 44.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.109107611
Minimum: 1
Maximum: 15
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 8
Q3: 15
95-th percentile: 15
Maximum: 15
Range: 14
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 4.80401161
Coefficient of variation (CV): 0.5924217362
Kurtosis: -1.270267306
Mean: 8.109107611
Median Absolute Deviation (MAD): 4
Skewness: 0.2614423216
Sum: 4758027
Variance: 23.07852755
Monotonicity: Not monotonic
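
With 44.0% of device_id missing, it matters which denominator the mean uses. The sketch below shows that the reported mean matches the sum divided by the number of non-missing rows, which suggests (as an assumption about the tool, not something stated in the report) that missing values are skipped rather than counted as zero.

```python
# Figures copied from the device_id statistics in this report.
n_observations = 1_048_575
n_missing = 461_824
total = 4_758_027  # reported Sum

n_present = n_observations - n_missing
mean_skipping_missing = total / n_present
mean_over_all_rows = total / n_observations  # for contrast only
missing_pct = 100 * n_missing / n_observations

print(n_present, round(mean_skipping_missing, 6), round(mean_over_all_rows, 6))
```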
Histogram with fixed size bins (bins=10)
Value      Count   Frequency (%)
15         150473  14.4%
8          73150   7.0%
6          64254   6.1%
3          63345   6.0%
5          60755   5.8%
10         53770   5.1%
1          47240   4.5%
4          34704   3.3%
11         26058   2.5%
2          13002   1.2%
(Missing)  461824  44.0%

Value  Count   Frequency (%)
1      47240   4.5%
2      13002   1.2%
3      63345   6.0%
4      34704   3.3%
5      60755   5.8%
6      64254   6.1%
8      73150   7.0%
10     53770   5.1%
11     26058   2.5%
15     150473  14.4%

Value  Count   Frequency (%)
15     150473  14.4%
11     26058   2.5%
10     53770   5.1%
8      73150   7.0%
6      64254   6.1%
5      60755   5.8%
4      34704   3.3%
3      63345   6.0%
2      13002   1.2%
1      47240   4.5%

zone_id
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9105157
Minimum: 0
Maximum: 5
Zeros: 61451
Zeros (%): 5.9%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 3
Q3: 3
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.238552958
Coefficient of variation (CV): 0.4255441598
Kurtosis: 0.2941330708
Mean: 2.9105157
Median Absolute Deviation (MAD): 1
Skewness: -0.291233769
Sum: 3051894
Variance: 1.53401343
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count   Frequency (%)
3      516231  49.2%
2      184004  17.5%
5      145152  13.8%
4      89232   8.5%
0      61451   5.9%
1      52505   5.0%

Value  Count   Frequency (%)
0      61451   5.9%
1      52505   5.0%
2      184004  17.5%
3      516231  49.2%
4      89232   8.5%
5      145152  13.8%

Value  Count   Frequency (%)
5      145152  13.8%
4      89232   8.5%
3      516231  49.2%
2      184004  17.5%
1      52505   5.0%
0      61451   5.9%
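
Since zone_id takes only six values, its summary statistics can be rebuilt from the frequency table alone. A minimal sketch, assuming the report uses the sample variance (divisor n - 1); the names are illustrative:

```python
import math

# Value -> count for zone_id, copied from the frequency table in this report.
counts = {0: 61_451, 1: 52_505, 2: 184_004, 3: 516_231, 4: 89_232, 5: 145_152}

n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n

# Weighted sum of squared deviations, then the sample (n - 1) variance.
sum_sq_dev = sum(c * (v - mean) ** 2 for v, c in counts.items())
variance = sum_sq_dev / (n - 1)
std = math.sqrt(variance)

print(n, total, round(mean, 7), round(variance, 8), round(std, 9))
```

The recomputed sum, mean and variance agree with the values the report lists for zone_id, which also confirms how the fused value and count digits in the frequency table split.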

deleted
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.0 MiB

Value  Count
0      1048575

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1048575
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count    Frequency (%)
0      1048575  100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count    Frequency (%)
0      1048575  100.0%

Most occurring characters

Value  Count    Frequency (%)
0      1048575  100.0%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  1048575  100.0%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      1048575  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  1048575  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      1048575  100.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1048575  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      1048575  100.0%

deleted_date
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 1048575
Missing (%): 100.0%
Memory size: 8.0 MiB

Interactions
