Data analytics projects completed during the Yandex Practicum Bootcamp (2022-2023).

Author: Alesiankova Darya

Project name | Description | Libraries and methods used |
---|---|---|
1. Analysis of user behavior in the "Unnecessary Things" app | In the "Unnecessary Things" app, users sell items they no longer need. The project examines which events influence conversion to the target action, which usage scenarios stand out, and how the time between common user events differs. | Pandas, Matplotlib, Seaborn, Plotly, event analytics, product metrics, statistical hypothesis testing, data visualization |
2. Analysis of application business indicators. Cohort analysis | Despite heavy investment in advertising, ProcrastinatePRO has been losing money over the past few months. The task is to find out why. Using data from the ProcrastinatePRO app, I calculated metrics such as LTV, CAC, retention rate, DAU, WAU, and MAU, applied cohort analysis, and drew conclusions from the results. | Pandas, Matplotlib, Seaborn, cohort analysis, unit economics metrics |
3. Evaluation of the results and correctness of an A/B test | The company ran the A/B test "recommender_system_test". Using the dataset of user actions, the terms of reference, and auxiliary datasets, the task is to check that the test was run correctly and to evaluate its results. | Pandas, Matplotlib, SciPy, A/B testing, statistical hypothesis testing |
4. Research of the catering market in Moscow | A study of Moscow's public catering market based on open data, aimed at finding the best location for a new establishment. Includes a presentation for investors. | Pandas, NumPy, Matplotlib, Seaborn, Plotly, Folium, data visualization |
5. Analysis of user behavior in a mobile app for a food startup | An analysis of mobile app user behavior for a startup that sells food. Goals of the project: analyze the sales funnel to understand where users "get stuck", and evaluate the results of an A/A/B experiment to determine the optimal font for the app. | Pandas, Matplotlib, math, NumPy, SciPy, Plotly |
6. A/B test. Hypothesis testing for an online store | Together with the marketing department, a list of hypotheses for increasing the revenue of a large online store was prepared. During the project, I prioritized 9 hypotheses using the ICE and RICE frameworks, then analyzed the results of the A/B test: plotted cumulative revenue, average order value, and conversion by group, and calculated the statistical significance of the differences in conversion and average order value on raw and cleaned data. Based on the analysis, I concluded that it was not advisable to continue the test. | Pandas, Matplotlib, SciPy, A/B testing, statistical hypothesis testing |
7. Custom event dashboard for a news aggregator | The task is to build a dashboard showing how users interact with cards. I exported data from the database, built the dashboard, and prepared a presentation. | Pandas, SQLAlchemy, Tableau, product metrics |
8. Analysis of tariff plans | Customers of the federal mobile operator "Megaline" are offered two tariff plans, "Smart" and "Ultra". To adjust the advertising budget, the commercial department wants to understand which plan brings in more revenue. | Pandas, SciPy, NumPy, Matplotlib, Seaborn, math |
9. Identifying patterns that determine the success of computer games | The goal is to identify patterns that determine a game's success. This will make it possible to bet on a potentially popular product and plan advertising campaigns for 2017 for Strimchik, an online store that sells computer games worldwide. After preprocessing the data, I performed exploratory data analysis: examined how sales depend on region, genre, and platform, analyzed promising platforms, and built a profile of the typical customer by genre, platform, and region. I also tested hypotheses about the equality of user ratings across genres and several platforms. | Pandas, Matplotlib, NumPy, SciPy, descriptive statistics, statistical hypothesis testing |
10. Banking data analysis | The customer is the bank's credit department. The objective is to find out whether a client's marital status and number of children affect on-time loan repayment. The input data is the bank's statistics on customer solvency. The results will be taken into account when building a credit scoring model, a system that assesses a potential borrower's ability to repay a loan. | Pandas, Matplotlib, Seaborn |
11. Parameters that affect the cost of housing in St. Petersburg | The task is to determine the main parameters of apartments, calculate their cost, and describe a typical apartment in St. Petersburg and the Leningrad Region. The customer will use the results to build an automated system for tracking anomalies and fraudulent activity. During the research, I determined the parameters of various apartments and made recommendations for filtering out anomalous listings. | Pandas, Matplotlib, Seaborn |
12. Data analysis of the Yandex.Music service | The task is to compare the musical preferences of Yandex.Music users in Moscow and St. Petersburg by testing three hypotheses. The comparison showed both differences and similarities in user preferences; only one of the three hypotheses was confirmed. | Pandas, Matplotlib, Seaborn |
13. Analysis of the database of a book-reading service | The task is to analyze a database containing information about books, publishers, authors, and user reviews. The results will help formulate a value proposition for a new product. | PostgreSQL |
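
Project 6 prioritizes hypotheses by ICE and RICE. As a rough illustration of how such scoring can work, here is a minimal sketch in plain Python. It is not the project's actual code: the `Hypothesis` class, the example hypotheses, and their scores are all hypothetical, and it assumes the variant of the formulas where ICE = Impact × Confidence / Effort and RICE = Reach × Impact × Confidence / Effort (other sources define ICE with an "Ease" factor instead).

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """A hypothesis scored on 1-10 scales (hypothetical example structure)."""
    name: str
    reach: int
    impact: int
    confidence: int
    effort: int

    def ice(self) -> float:
        # Assumed formula: ICE = Impact * Confidence / Effort
        return self.impact * self.confidence / self.effort

    def rice(self) -> float:
        # Assumed formula: RICE = Reach * Impact * Confidence / Effort
        return self.reach * self.impact * self.confidence / self.effort


# Made-up hypotheses and scores, for illustration only
hypotheses = [
    Hypothesis("Add a subscription form", reach=10, impact=7, confidence=8, effort=5),
    Hypothesis("Launch a birthday discount", reach=1, impact=9, confidence=9, effort=5),
    Hypothesis("Add a new delivery option", reach=2, impact=5, confidence=4, effort=10),
]

# Rank from highest to lowest score under each framework
by_ice = sorted(hypotheses, key=lambda h: h.ice(), reverse=True)
by_rice = sorted(hypotheses, key=lambda h: h.rice(), reverse=True)

for h in by_rice:
    print(f"{h.name}: ICE={h.ice():.1f}, RICE={h.rice():.1f}")
```

With these made-up scores the two rankings disagree: the birthday discount leads by ICE, while the subscription form leads by RICE because it reaches far more users. This is the usual reason to compare both frameworks before choosing what to test.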