Guest User

sqlite3_bad_performance

a guest
Jan 25th, 2015
263
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
HTML 5 3.62 KB | None | 0 0
  1. My Process.csv is around 27G. I've gzipped it and put at ftp://navinps:[email protected] as process.csv.gz
  2.  
  3. There is only 1 file there.
  4. md5sum process.csv.gz
  5. e77a322744a26d4c8a1ad4d61a84ee72  process.csv.gz
  6.  
  7.  [root@centosnavin sqlite-autoconf-3080801]# cat sqlite3commands.txt
  8. CREATE TABLE [hp_table1] ( InstanceId INTEGER, LogTime INTEGER, ArrivalTime INTEGER, CollectionTime INTEGER, [dml_PROC_TIME] TIME, [dml_PROC_INTERVAL] INTEGER, [dml_PROC_INTEREST] TEXT, [dml_PROC_TOP_CPU_INDEX] INTEGER, [dml_PROC_TOP_DISK_INDEX] INTEGER, [dml_PROC_STATE_FLAG] INTEGER, [dml_PROC_RUN_TIME] REAL, [dml_PROC_STOP_REASON_FLAG] INTEGER, [dml_PROC_INTERVAL_ALIVE] INTEGER, [dml_PROC_STOP_REASON] TEXT, [dml_PROC_STATE] TEXT, [dml_PROC_PRI] INTEGER, [dml_PROC_NICE_PRI] INTEGER, [dml_PROC_CPU_LAST_USED] INTEGER, [dml_PROC_CPU_SWITCHES] INTEGER, [dml_PROC_IO_BYTE] REAL, [dml_PROC_VOLUNTARY_CSWITCH] INTEGER, [dml_PROC_FORCED_CSWITCH] INTEGER, [dml_PROC_IO_BYTE_RATE] REAL, [dml_PROC_CPU_TOTAL_UTIL] REAL, [dml_PROC_CPU_TOTAL_TIME] REAL, [dml_PROC_CPU_SYS_MODE_UTIL] REAL, [dml_PROC_CPU_SYS_MODE_TIME] REAL, [dml_PROC_CPU_USER_MODE_UTIL] REAL, [dml_PROC_CPU_USER_MODE_TIME] REAL, [dml_PROC_THREAD_COUNT] INTEGER, [dml_PROC_CPU_ALIVE_TOTAL_UTIL] REAL , [dml_PROC_CPU_ALIVE_USER_MODE_UTIL]
  9.  REAL, [dml_PROC_CPU_ALIVE_SYS_MODE_UTIL] REAL, [dml_PROC_CPU_SYSCALL_UTIL] REAL, [dml_PROC_CPU_SYSCALL_TIME] REAL, [dml_PROC_CHILD_CPU_USER_MODE_UTIL] REAL, [dml_PROC_CHILD_CPU_SYS_MODE_UTIL] REAL, [dml_PROC_CHILD_CPU_TOTAL_UTIL] REAL, [dml_PROC_DISK_PHYS_READ] INTEGER, [dml_PROC_DISK_PHYS_READ_RATE] REAL, [dml_PROC_DISK_PHYS_WRITE] INTEGER, [dml_PROC_DISK_PHYS_WRITE_RATE] REAL, [dml_PROC_DISK_PHYS_IO_RATE] REAL, [dml_PROC_MEM_RES] REAL, [dml_PROC_MEM_SHARED_RES] REAL, [dml_PROC_MEM_VIRT] REAL, [dml_PROC_MEM_DATA_VIRT] REAL, [dml_PROC_MEM_STACK_VIRT] REAL, [dml_PROC_PAGEFAULT] INTEGER, [dml_PROC_PAGEFAULT_RATE] REAL, [dml_PROC_MINOR_FAULT] INTEGER, [dml_PROC_MAJOR_FAULT] INTEGER, [dml_PROC_MEM_LOCKED] REAL, [dml_PROC_DISK_SUBSYSTEM_WAIT_PCT] REAL, [dml_PROC_DISK_SUBSYSTEM_WAIT_TIME] REAL, [dml_PROC_PRI_WAIT_PCT] REAL, [dml_PROC_PRI_WAIT_TIME] REAL, PRIMARY KEY (InstanceId, CollectionTime)) WITHOUT ROWID ;
  10. .timer on
  11. .mode csv
  12. .import /home/navin/oa_nvn/process.csv hp_table1
  13. [root@centosnavin sqlite-autoconf-3080801]# sync
  14. [root@centosnavin sqlite-autoconf-3080801]# ./sqlite3 hptest.db < sqlite3commands.txt
  15. [root@centosnavin sqlite-autoconf-3080801]# du -sh hptest.db
  16. 14G     hptest.db
  17. [root@centosnavin sqlite-autoconf-3080801]# time ./sqlite3 hptest.db "select count(*) from hp_table1; "
  18.  
  19.  
  20. 115349845
  21.  
  22. real    26m56.435s
  23. user    0m1.591s
  24. sys     0m21.262s
  25. [root@centosnavin sqlite-autoconf-3080801]# echo "pragma page_size; " | ./sqlite3
  26. 65536  [ tried with default page size also ]
  27. [root@centosnavin sqlite-autoconf-3080801]#
  28.  
  29.  
  30.  
  31. POSTGRES: [ Same machine, same FS disk , similar load, RAM 4G]
  32.  
  33. DB size is around 34.2 GB.
  34.  
  35. bash-4.2$ time psql -c "select count(*) from dml_Scope__Process; "
  36.   count  
  37. -----------
  38. 115349845
  39. (1 row)
  40.  
  41.  
  42. real    4m28.946s
  43. user    0m0.001s
  44. sys     0m0.004s
  45. -bash-4.2$
  46.  
  47. The table is named differently. The data is same. I think postgres SQL is around 2 times the size of sqlite3 database but as you see it is like more than 6 times faster.
  48.  
  49. I also did another experiment. I created this table and did a vaccum and then the select count(*) in sqlite3 was around 2 mins.
  50.  
  51. When I create an index manually after the table is loaded (imported from csv), select count(*) in sqlite3 was within 30 to 40 secs.
  52.  
  53. I'm  stuck here how to go about achieving better speeds without always creating index after inserting data. ?
Advertisement
Add Comment
Please, Sign In to add comment