Conversation

@lilioid (Contributor) commented May 29, 2020

This depends on Map-Data/tileserver-mapping#10

I tried to reproduce the steps from https://github.com/Map-Data/regiontileserver/blob/master/import_data.sh as closely as possible, with the addition of coordinating files with tileserver-mapping.

@lilioid lilioid requested a review from Akasch May 29, 2020 11:54
def _create_postgres_db(self):
# create new postgresql cluster
print_stage('Creating PostgreSQL cluster')
db_dir = os.path.join(self.working_dir, 'pg_data')
Member commented:

This will do unexpected things if one starts two containers processing in parallel with the same dirs; maybe also use the tile coordinates?

Contributor Author replied:

good idea
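A minimal sketch of that suggestion, folding the tile coordinates into the data directory name so parallel containers never share a cluster directory (`tile_x`, `tile_y` and `zoom` are hypothetical names for whatever coordinates the importer carries):

```python
import os

def pg_data_dir(working_dir, tile_x, tile_y, zoom):
    """Build a PostgreSQL data directory that is unique per tile,
    so two containers processing different tiles in the same
    working_dir cannot clobber each other's cluster."""
    return os.path.join(working_dir, f'pg_data_{zoom}_{tile_x}_{tile_y}')
```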

db_dir = os.path.join(self.working_dir, 'pg_data')
subprocess.run(['rm', '-rf', db_dir], check=True)
os.makedirs(db_dir)
subprocess.run(['pg_createcluster', PG_VERSION, 'main', '--start', '--datadir', db_dir], check=True,
Member commented:

We probably should edit the configuration of the database and allow more RAM use
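A sketch of what such tuning could look like, rendered as `postgresql.conf` overrides to append after `pg_createcluster`; the specific values (25 % of RAM for `shared_buffers`, `fsync = off` for a throwaway import cluster) are assumptions following common bulk-import advice, not measured settings:

```python
def render_pg_tuning(total_ram_mb):
    """Render postgresql.conf overrides for a bulk-import workload.
    Durability settings are relaxed because this cluster only lives
    until the dump is taken and is deleted afterwards."""
    shared_buffers = total_ram_mb // 4
    maintenance_work_mem = min(total_ram_mb // 8, 2048)
    return '\n'.join([
        f'shared_buffers = {shared_buffers}MB',
        f'maintenance_work_mem = {maintenance_work_mem}MB',
        'fsync = off            # acceptable: cluster is thrown away after the dump',
        'full_page_writes = off',
    ])
```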

print_stage("Importing data with osm2pgsql")
subprocess.run([
'su', 'postgres', '-c',
f'osm2pgsql --slim --hstore-all -C 3000 '
Member commented:

is the 3000 a good choice for RAM usage?

Contributor Author replied:

that's what the original script used
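If the inherited 3000 MB ever becomes a problem, the `-C` node-cache value could be derived from the machine's RAM instead of hard-coded; a sketch using only the standard library, falling back to the original value when total memory cannot be determined:

```python
import os

def osm2pgsql_cache_mb(fraction=0.5, fallback_mb=3000):
    """Size osm2pgsql's -C node cache as a fraction of physical RAM.
    fraction=0.5 is an assumption, not a benchmarked choice."""
    try:
        pages = os.sysconf('SC_PHYS_PAGES')
        page_size = os.sysconf('SC_PAGE_SIZE')
    except (ValueError, OSError):
        return fallback_mb
    total_mb = pages * page_size // (1024 * 1024)
    return max(1, int(total_mb * fraction))
```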

f'-P {self.db_port} '
f'-U postgres '
f'--number-processes {max(1, os.cpu_count() - 2)} '
f'{os.path.join(self.working_dir, self.pbf_file_name)}'
Member commented:

use unlogged tables since we will drop them later on?
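Assuming the comment refers to PostgreSQL's unlogged tables (which skip WAL writes and are fine for a cluster that only lives until the dump), one way is to flip the import tables to unlogged right after osm2pgsql finishes. The `planet_osm_*` names are osm2pgsql's defaults and would need adjusting for a custom table prefix:

```python
import subprocess

# osm2pgsql's default rendering tables; adjust for a custom prefix.
OSM_TABLES = ('planet_osm_point', 'planet_osm_line',
              'planet_osm_polygon', 'planet_osm_roads')

def unlogged_sql(tables=OSM_TABLES):
    """Build the ALTER TABLE statements (PostgreSQL 9.5+)."""
    return '; '.join(f'ALTER TABLE {t} SET UNLOGGED' for t in tables)

def make_tables_unlogged(db_port, dbname):
    """Run the statements through psql, mirroring how the importer
    already shells out via `su postgres`."""
    subprocess.run(['su', 'postgres', '-c',
                    f'psql -p {db_port} -d {dbname} -c "{unlogged_sql()}"'],
                   check=True)
```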

with open(file_path, 'wb') as f:
subprocess.run(['su', 'postgres', '-c',
f'pg_dump -p {self.db_port} -d {self.db_dbname} --format custom'],
check=True, cwd=self.out_dir, stdout=f)
Member commented:

add compression?
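pg_dump's custom format already compresses at a moderate default level; the standard `--compress` option raises it explicitly, trading CPU time for a smaller dump. A sketch of the adjusted command string:

```python
def dump_command(db_port, dbname, level=9):
    """pg_dump invocation with explicit maximum compression.
    --compress (-Z) takes a level from 0 to 9."""
    return (f'pg_dump -p {db_port} -d {dbname} '
            f'--format custom --compress {level}')
```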

print_stage('Uploading PostgreSQL dump to tileserver-mapping')
file_path = os.path.join(self.working_dir, 'db.pg_dump')
self.api.upload_sql_dump(self.tile, file_path)
subprocess.run(['rsync', file_path, self.out_dir], check=True)
Member commented:

if it is uploaded I would not also put it into local storage; that would only waste space
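One way to act on this: keep the local copy only as a fallback when the upload fails, instead of always storing the dump twice. `upload` is a hypothetical callable standing in for `self.api.upload_sql_dump`:

```python
import os
import shutil

def store_dump(file_path, out_dir, upload):
    """Upload the dump; keep a local copy in out_dir only when the
    upload fails, so successfully uploaded dumps don't waste disk."""
    try:
        upload(file_path)
    except Exception:
        # Preserve the dump locally so the import work is not lost,
        # then surface the error to the caller.
        shutil.move(file_path, os.path.join(out_dir, os.path.basename(file_path)))
        raise
    os.remove(file_path)
```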

def _upload_dump(self):
print_stage('Uploading PostgreSQL dump to tileserver-mapping')
file_path = os.path.join(self.working_dir, 'db.pg_dump')
self.api.upload_sql_dump(self.tile, file_path)
Member commented:

does it retry the upload on failure?
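The code as written does not appear to retry; a generic retry wrapper with exponential backoff could be layered around the call (`retry` and its parameters are hypothetical, not part of the existing API client):

```python
import time

def retry(fn, attempts=3, base_delay=1.0):
    """Call fn, retrying with exponential backoff on any exception;
    re-raises the last error once the attempts are exhausted."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

Usage would then look like `retry(lambda: self.api.upload_sql_dump(self.tile, file_path))`.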
