Scraping-problem-links-and-problem-statements-from-coding-website

Web scraping is a technique for extracting large amounts of data from websites and saving it to a local file or a database. This project scrapes question links and their problem statements from the coding website CodingBat. The website has two groups of questions, one for Python and one for Java, and each group contains several sections of coding problems.

Basic steps for web scraping

Load the document from which you want to scrape the data.
Parse the document so that it can be searched.
Extract the data from the web pages.
Transform the data into a useful format.
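
The four steps above can be sketched as four small functions. This is a minimal illustration rather than the code in PythonCode.py: the sample HTML, the function names, and the "/python/" link prefix are all assumptions made for the example, and the load step returns an inline string instead of downloading a page.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Made-up sample page; a real run would download this HTML instead.
SAMPLE_HTML = """
<html><body>
  <a href="/python/Warmup-1">Warmup-1</a>
  <a href="/python/String-1">String-1</a>
  <a href="/about.html">about</a>
</body></html>
"""


def load(html=SAMPLE_HTML):
    """Step 1: load the document (here, just return the inline sample)."""
    return html


def parse(html):
    """Step 2: parse the document so that it can be searched."""
    return BeautifulSoup(html, "html.parser")


def extract(soup, prefix):
    """Step 3: extract (text, href) pairs for links whose href starts with prefix."""
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
        if a["href"].startswith(prefix)
    ]


def transform(pairs, base_url):
    """Step 4: transform the pairs into rows with absolute links."""
    return [
        {"section": name, "link": urljoin(base_url, href)}
        for name, href in pairs
    ]


rows = transform(extract(parse(load()), "/python/"), "https://codingbat.com")
for row in rows:
    print(row["section"], row["link"])
```

Keeping each step in its own function makes it easy to replace the load step with a real HTTP request later without touching the parsing or output code.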

Process to access the HTML code of a live site

User ---> Request ---> Server ---> Response ---> HTML code
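
The request/response flow can be sketched with the requests package. The function names and the fixed User-Agent string below are assumptions for illustration; the fake-useragent package listed under the required packages could supply the string instead (for example via `UserAgent().random`).

```python
import requests


def build_request(url, user_agent):
    """Prepare (but do not send) a GET request carrying a User-Agent header."""
    headers = {"User-Agent": user_agent}
    return requests.Request("GET", url, headers=headers).prepare()


def fetch_html(url, user_agent, timeout=10):
    """Send the request to the server and return the response body as HTML text."""
    prepared = build_request(url, user_agent)
    with requests.Session() as session:
        response = session.send(prepared, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of returning an error page.
    response.raise_for_status()
    return response.text


# Inspect the request without touching the network.
prepared = build_request("https://codingbat.com/python", "example-agent/1.0")
print(prepared.method, prepared.url)
```

Calling `fetch_html("https://codingbat.com/python", "example-agent/1.0")` would perform the actual round trip and return the page's HTML.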

Beautiful Soup is a Python library used for pulling data out of HTML and XML files. It provides efficient techniques for searching and modifying the parsed document.

HTML code is a tree of tags, and Beautiful Soup parses this tree so that data can be extracted from the tags.

                                          <html>
                                       
                         <head>                          <body>
                         
              <meta>                <title>      <p>                 <p>
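
A small sketch of walking such a tree with Beautiful Soup. The HTML string mirrors the diagram, its text content is made up for the example, and `child_tags` is a hypothetical helper.

```python
from bs4 import BeautifulSoup

# Same shape as the diagram: html -> head (meta, title) and body (p, p).
html = (
    "<html><head><meta charset=\"utf-8\"><title>Sample page</title></head>"
    "<body><p>First paragraph</p><p>Second paragraph</p></body></html>"
)
soup = BeautifulSoup(html, "html.parser")


def child_tags(tag):
    """Return the names of the direct child tags of a tag."""
    return [child.name for child in tag.find_all(recursive=False)]


top_level = child_tags(soup.html)      # children of <html>
head_children = child_tags(soup.head)  # children of <head>
body_children = child_tags(soup.body)  # children of <body>

# Searching the tree: the title text and the text of every <p> tag.
title_text = soup.title.string
paragraphs = [p.get_text() for p in soup.find_all("p")]
print(top_level, head_children, body_children, title_text, paragraphs)
```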

Required packages:

requests (pip install requests)
Beautiful Soup (pip install beautifulsoup4)
UserAgent (pip install fake-useragent)
xlsxwriter (pip install xlsxwriter)
xlrd (pip install xlrd)

References:

Below are the links to the Java and Python sections of the coding website from which the data is scraped:

https://codingbat.com/java

https://codingbat.com/python

Output

Scraped_questions.xlsx is the output file. It contains two worksheets, one for the Java section and one for the Python section. Each worksheet contains the section links, the question links for each section, and the problem statements of all questions.
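
A workbook with this layout could be written with xlsxwriter roughly as follows. The worksheet names, column headers, the `write_workbook` helper, and the placeholder row are assumptions for illustration; the actual file may be organised differently.

```python
import xlsxwriter

# Assumed column headers; the real worksheet may label its columns differently.
HEADERS = ["Section link", "Question link", "Problem statement"]


def write_workbook(path, rows_by_sheet):
    """Write one worksheet per key, with a header row followed by the data rows."""
    workbook = xlsxwriter.Workbook(path)
    for sheet_name, rows in rows_by_sheet.items():
        worksheet = workbook.add_worksheet(sheet_name)
        worksheet.write_row(0, 0, HEADERS)
        for row_index, row in enumerate(rows, start=1):
            worksheet.write_row(row_index, 0, row)
    # The file is only written to disk when the workbook is closed.
    workbook.close()


# Placeholder rows standing in for scraped data.
example_rows = {
    "java": [["https://codingbat.com/java", "(question link)", "(problem statement)"]],
    "python": [["https://codingbat.com/python", "(question link)", "(problem statement)"]],
}
```

Calling `write_workbook("Scraped_questions.xlsx", example_rows)` would then produce a file with one worksheet per language.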

gangwar-107/Scraping-problem-links-and-problem-statements-from-coding-website
