- Read the whole document before making any changes.
- Don't use relative imports.
- Indent with 4 spaces. Use double quotes by default. Follow PEP8 naming conventions.
- Follow DRY and SOLID principles.
- Keep in mind that your code should be testable; see e.g. https://youtu.be/XVZpi7VJ_ws.
- If your code has state, you must wrap it in a class:

    # Bad
    x = "something"

    def foo(arg):
        global x  # mutates module-level state
        x = arg

    # Better
    class Foo:
        def __init__(self):
            self.x = "something"

        def foo(self, arg):
            self.x = arg

- Fork this repo to make your own changes and assign @ijimiji as a collaborator.
- Add autogenerated files (`*.pyc`, caches, database file) to `.gitignore`.
- Don't use force push. Use reverts or roll back code manually.
- The main branch has to be `master`; the main development branch has to be `dev`. Approved changes should only be made to the `dev` branch.
- Each step has to be done in a separate feature branch, e.g. `feature/initial`, `feature/parser`. I advise checking out `dev`, rebasing onto the previous feature branch, and creating a new branch at this point.
- You don't have to make changes in a branch if doing so would cause merge conflicts. You can create a new branch for fixes.
- You can split each step into its logical parts and create separate branches.
- When a feature is ready, you have to create a pull request `dev <- feature/foo` and assign @ijimiji as reviewer.
- Commits have to be informative, e.g. no "update" or "fix" commits; describe the changes made. Use the same capitalization throughout your project.

    # Bad commit messages
    upd
    fix
    foo
    try again
    upd2
    fix of fix

    # Better commit messages
    Remove unused imports
    Implement web parser class
    Add additional checks to parser

Implement a web parser CLI utility.
- Use https://python-poetry.org/ to manage your dependencies.
  - You can come back to this step later, after you finish the next steps, but in that case you have to provide a `requirements.txt` for `pip`.
- Create a `config` module with a `Config` class that implements `read()` and `get()` methods.
- `read()` parses a config stored in a plain text file in the following format into a dictionary:

    key1 = value
    key2 = foo
    # Has to produce
    # {
    #     "key1": "value",
    #     "key2": "foo"
    # }

- `Config` receives the name of the file to be parsed through its constructor.
- `get()` returns the parsed dictionary.
- Wrap `read` with a cache decorator to avoid multiple filesystem reads.

Other modules should use your config like this:

    config = Config(filename="cfg.txt")
    config.read()
    url = config.get()["url"]
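
The `Config` requirements above could fit together roughly as follows. This is only a sketch under assumptions: it uses `functools.lru_cache` as the cache decorator, it skips blank lines, lines starting with `#`, and lines without `=` (the format description does not cover these cases), and it lets `get()` call `read()` so that `get()` also works on its own.

```python
from functools import lru_cache
from typing import Dict


class Config:
    def __init__(self, filename: str):
        self.filename = filename

    @lru_cache(maxsize=None)
    def read(self) -> Dict[str, str]:
        """Parse "key = value" lines; the file is opened only on the first call."""
        parsed = {}
        with open(self.filename, encoding="utf-8") as file:
            for line in file:
                line = line.strip()
                # Assumption: blank lines, "#" comments and lines without "=" are ignored.
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                parsed[key.strip()] = value.strip()
        return parsed

    def get(self) -> Dict[str, str]:
        # Served from the cache once read() has run.
        return self.read()
```

Note that `lru_cache` on a method keys the cache on `self`, so each `Config` instance gets its own entry and stays referenced by the cache for the lifetime of the process. That is usually acceptable for a single long-lived config object.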
- Pick a news website to parse.
- Add `beautifulsoup4` as a dependency to your project: https://beautiful-soup-4.readthedocs.io/en/latest/.
- In the `parser` module, create a `Parser` class that receives the URL to be parsed through its constructor.
- In the `parser` module, create an `Article` class that contains:
  - Title
  - Abstract
  - Image preview URL (if provided)
- `Article` has to use the `@dataclass` decorator.
- `Parser` has to implement a `parse()` method that returns a list of `Article` objects created with data parsed from the website.
- Use `bs4` to parse the data.
- You can add additional packages if you need XML support.
- The caller has to provide the URL via the config file.
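
One possible shape for the `parser` module is sketched below. Everything site-specific is an assumption: the markup (`<article>` elements containing `<h2>`, `<p>`, and `<img>`) is hypothetical and has to be replaced with selectors for the website you picked. The `image_url` field name, the `parse_html()` helper, and fetching with `urllib` are illustrative choices as well.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.request import urlopen


@dataclass
class Article:
    title: str
    abstract: str
    image_url: Optional[str] = None  # the preview image is optional


class Parser:
    def __init__(self, url: str):
        self.url = url

    def parse_html(self, html: str) -> List[Article]:
        # Imported here so that Article can be used without bs4 installed.
        from bs4 import BeautifulSoup

        soup = BeautifulSoup(html, "html.parser")
        articles = []
        # Hypothetical markup: adjust the tag names to the real website.
        for node in soup.find_all("article"):
            title = node.find("h2")
            abstract = node.find("p")
            image = node.find("img")
            if title is None or abstract is None:
                continue  # skip blocks that do not look like an article
            articles.append(
                Article(
                    title=title.get_text(strip=True),
                    abstract=abstract.get_text(strip=True),
                    image_url=image.get("src") if image is not None else None,
                )
            )
        return articles

    def parse(self) -> List[Article]:
        # Download the page, then delegate to the offline-testable helper.
        with urlopen(self.url) as response:
            html = response.read().decode("utf-8", errors="replace")
        return self.parse_html(html)
```

Keeping the download in `parse()` and the extraction in `parse_html()` means the extraction logic can be unit-tested against a saved HTML string without network access.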
- Add `SQLAlchemy` as a dependency to your project: https://docs.sqlalchemy.org/en/14/intro.html#installation
- Use `sqlite` as the SQL database: https://realpython.com/python-sqlite-sqlalchemy/
- Create an `Article` model with `sqlalchemy`. Read the docs for more details.
- Create an `ArticleDatabase` class in the `database` module that has to:
  - Ensure in the constructor that the appropriate database and table exist, and create them if they are not present.
  - Implement a `save` method that accepts a list of `Article` and saves them in the database.
  - Implement a `get` method that returns a list of `Article` contained in the database.
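
The `database` module could look roughly like the sketch below, written against the SQLAlchemy 1.4 ORM API. The table name, column names, and default database URL are hypothetical, and the sketch assumes `save` receives instances of the SQLAlchemy `Article` model. Converting from the parser's dataclass is left out.

```python
from typing import List

from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Article(Base):
    __tablename__ = "articles"  # hypothetical table name

    id = Column(Integer, primary_key=True)
    title = Column(String, nullable=False)
    abstract = Column(String, nullable=False)
    image_url = Column(String, nullable=True)


class ArticleDatabase:
    def __init__(self, url: str = "sqlite:///articles.db"):
        self.engine = create_engine(url)
        # create_all() only creates tables that do not exist yet.
        Base.metadata.create_all(self.engine)

    def save(self, articles: List[Article]) -> None:
        # expire_on_commit=False keeps the objects readable after the session closes.
        with Session(self.engine, expire_on_commit=False) as session:
            session.add_all(articles)
            session.commit()

    def get(self) -> List[Article]:
        with Session(self.engine) as session:
            return list(session.execute(select(Article)).scalars().all())
```

Passing `"sqlite:///:memory:"` as the URL gives a throwaway database, which is convenient in tests.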
- Dockerfile for the project.
- Formatters:
  - `isort`
  - `black`
  - `pre-commit` that runs `isort` and `black` automatically: https://pre-commit.com/hooks