
Data science has grown in popularity over the past few years, and even more so during the pandemic. With millions of gigabytes of data generated every day, companies are increasingly adopting data-driven decision making. Data Scientists, Data Analysts, Data Engineers and Business Analysts are in high demand to organize, work with and extract insights from this data.

The objective of this project is to predict salary based on information collected about these job roles two years ago. With this data set I aim to help those who, like me, are looking for one of these job roles, and to bring all the needed information into one place so that others can use it too. I also aim to find trends in how pay changes with company size, location, and so on. I am looking to answer the following questions:

• What are the best jobs and salaries in a desired location across the US?

• What kind of companies best fit my skillset?

• What are the companies looking for in an ideal candidate?

• How many job roles are currently available in the market?

Data Description:

The data set consists of 4 CSV files covering Data Scientist, Data Analyst, Data Engineer and Business Analyst roles. Each CSV file contains the job role, salary, job description and location for each company. Using matplotlib and seaborn, I will create visualizations of these profiles, and using NumPy, pandas, sklearn and a few other packages, I will analyze and predict salary.

  • Platform : Jupyter Notebook
  • Programming Languages : Python
  • Python Libraries : numpy, pandas, matplotlib, plotly, seaborn, scipy, statsmodels, sklearn, math, wordcloud
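
As a rough illustration of the first step, the CSV files can be read with pandas and stacked into one table, with an extra column recording which role each row came from. This is only a minimal sketch: it assumes one CSV per role, and the helper `load_job_postings`, the `role` column, the column names `Job Title` and `Location`, and the toy rows are all hypothetical stand-ins, not the actual files or schema used in this project.

```python
import io

import pandas as pd


def load_job_postings(sources):
    """Read one CSV per job role and stack them into a single DataFrame.

    `sources` maps a role label to anything pandas.read_csv accepts
    (a file path or an open file-like object).
    """
    frames = []
    for role, source in sources.items():
        frame = pd.read_csv(source)
        frame["role"] = role  # remember which file each row came from
        frames.append(frame)
    # ignore_index gives the combined table a fresh 0..n-1 index
    return pd.concat(frames, ignore_index=True)


# Toy in-memory stand-ins for the real files (made-up rows and column names).
toy_sources = {
    "Data Scientist": io.StringIO(
        'Job Title,Location\n'
        'Senior Data Scientist,"Austin, TX"\n'
        'Data Scientist,"New York, NY"\n'
    ),
    "Data Analyst": io.StringIO(
        'Job Title,Location\n'
        'Data Analyst,"Chicago, IL"\n'
    ),
}

postings = load_job_postings(toy_sources)

# Number of postings per role in the combined table
print(postings.groupby("role").size())
```

With the real data, the dictionary values would be file paths instead of `io.StringIO` objects. Keeping the role as a column means later plots and models can compare roles within a single DataFrame.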

 

Key Highlights:

Bar Plot of Underpaid and High-paid Data Scientists in the US

Job Titles vs Salaries Errorbar

Salary Variation with “Machine Learning” on Title

Salaries vs Keywords Barplot

Wordmap of Hot Keywords in DS Job Descriptions

Bar Plot of Top 10 States Hiring Data Scientists

Coefficient Bar of Salary Performance against Average

Data Science Job Salary Impact Map (Final Regression Result)