Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- import psycopg2
- import string
- import random
- from multiprocessing import Pool
- '''
- Machine:
- 1 core
- 1 GB RAM
- Improvements:
- #1 Removed the autocommit in populate_database
- While the option is needed to create a database, it is not to work within a given database.
- I removed the option (conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
- and explicitely added a commit to the connection further down (conn.commit()).
- This limits the amount of transaction and makes the queries significantly faster.
- 29 seconds -> 9 seconds
- #2 Added a Pool
- The operations done on the different databases are independent and can be executed simultaneously.
- I've added a Pool to run create_and_populate_database for each db simultaneously.
- Python Global Interpreter Lock allows better results even on a single threaded machines as it uses subprocesses and not threads.
- Here is the code:
- p = Pool(DATABASES_COUNT)
- p.map(create_and_populate_database, db_names)
- 9 seconds -> 4 seconds
- '''
- DATABASES_COUNT = 5
- DB_NAME_SIZE = 20
- TABLES_COUNT = 4
- TABLE_NAME_SIZE = 10
- ROWS_COUNT = 500
- STRING_NAME_SIZE = 8
- WORKERS = 0
- PG_HOST = "127.0.0.1"
- PG_USER = "postgres"
- def generate_random_string(chars):
- return ''.join(random.choice(string.ascii_lowercase + string.digits) for _ in range(chars))
- def populate_database(database_name):
- conn = psycopg2.connect("host='%s' dbname='%s' password='none' user='%s'" % (PG_HOST, database_name, PG_USER))
- cursor = conn.cursor()
- for i in range(0, TABLES_COUNT):
- table_name = 't' + generate_random_string(TABLE_NAME_SIZE)
- print("CREATING TABLE %s" % table_name)
- cursor.execute("CREATE TABLE %s (id serial, name varchar(%s) not null)" % (table_name, STRING_NAME_SIZE))
- for j in range(0, ROWS_COUNT):
- cursor.execute("INSERT INTO %s (name) VALUES ('%s')" % (table_name, generate_random_string(STRING_NAME_SIZE)))
- conn.commit()
- conn.close()
- def create_and_populate_database(database_name):
- print("Creating database %s" % database_name)
- conn = psycopg2.connect("host='%s' dbname='postgres' password='none' user='%s'" % (PG_HOST, PG_USER))
- conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
- cursor = conn.cursor()
- cursor.execute("CREATE DATABASE %s" % (database_name))
- conn.close()
- populate_database(database_name)
- db_names = []
- for i in range(0, DATABASES_COUNT):
- db_names.append('d' + generate_random_string(DB_NAME_SIZE))
- if __name__ == '__main__':
- p = Pool(DATABASES_COUNT)
- p.map(create_and_populate_database, db_names)
Add Comment
Please, Sign In to add comment