Data Scientist, Programmer, Mathematician
ngoogstein@gmail.com
I use Google BigQuery as a productive and reliable data warehouse for all our non-personalized data. It is accessible from anywhere in the world and can be easily queried and analyzed using Looker.
I utilize numerous Python scripts to extract data from internal sources. These scripts are run on a dedicated internal server operating Debian Linux, enabling efficient data retrieval and processing.
SQL is a significant part of my work because it is needed both for querying internal databases and as a native language inside BigQuery.
I have developed several useful dashboards that display the current state of our business.
As a mathematician, I take great pride in my diligent work with numbers and data, uncovering intricate patterns. With a dedicated focus on formulating and rigorously testing hypotheses, I utilize Python programming and the powerful BigQuery warehouse. My sincere goal is to advance my career in the fields of Data Science and Machine Learning, continuously enhancing my skills and expertise. I am committed to staying current with industry advancements, striving for excellence in all endeavors.
OPEN TO RELOCATION.
70 online dashboards
20 subordinates in my team
5 years in data analysis
4 years as a product owner
2 mathematical books
Orchestrated the implementation of a Business Intelligence (BI) solution on Google Cloud Platform (GCP), reducing manual Extract, Transform, Load (ETL) work, including web parsing, by 50-70%.
Built and configured over 70 Looker dashboards that deliver business metrics to support decision-making in e-commerce.
Designed and implemented an alert system for early detection of problems in e-commerce shopping carts.
Conducted A/B testing and other hypothesis tests using Python, SQL, and Google Colab to refine and optimize strategies.
Led a team of 20 individuals, including content managers and analysts, overseeing their tasks and ensuring seamless collaboration to achieve common goals.
Enhanced Price and Product Management System (MCF), seamlessly integrating and automating matching and pricing processes for all product showcases.
Developed and implemented dashboards showing sales forecasts for key product showcases using Python and the Holt-Winters method.
The course teaches Python skills specifically tailored towards data science, including data manipulation, analysis, and machine learning. It covers important Python libraries used in data science like Pandas, Seaborn, Matplotlib, scikit-learn. It works with real-world datasets to practice statistical techniques and machine learning algorithms for tasks like hypothesis testing and predictive modeling. The course aims to build hard skills in Python and data science, as well as career-readiness for roles like data scientist. It starts with Python basics but progresses to more advanced techniques like supervised learning and projects.
This course provides comprehensive training in data analysis, focusing on practical skills using Python. It begins with interactive exercises and hands-on experience with popular Python libraries such as pandas, NumPy, and Seaborn. By working with real-world datasets, you will develop expertise in data manipulation and exploratory data analysis. As you advance, the course covers essential topics such as data manipulation techniques and data joining. Additionally, you will acquire key statistical skills, including hypothesis testing.
Tableau is a widely used business intelligence (BI) and analytics software trusted by companies like Amazon, Experian, and Unilever to explore, visualize, and securely share data in the form of Workbooks and Dashboards. With its user-friendly drag-and-drop functionality, anyone can use it to quickly clean, analyze, and visualize their team’s data.
The Specialization includes:
1. Modernizing Data Lakes and Data Warehouses with Google Cloud
2. Preparing for the Google Cloud Professional Data Engineer Exam
3. Building Batch Data Pipelines on Google Cloud
4. Building Resilient Streaming Analytics Systems on Google Cloud
5. Google Cloud Big Data and Machine Learning Fundamentals
6. Smart Analytics, Machine Learning, and AI on Google Cloud
Use dimensions, measures, filters, and pivots to analyze and visualize data
Create advanced metrics instantaneously with table calculations
Create and share visualizations using Looks and dashboards
Discuss the use of folders and boards in Looker to manage and organize content
Review different methods of data loading (EL, ELT, and ETL) and when to use each
Run Hadoop on Dataproc, leverage Cloud Storage, and optimize Dataproc jobs
Build your data processing pipelines using Dataflow
Manage data pipelines with Data Fusion and Cloud Composer
June 2000 - November 2003
PhD in Mathematics
Russian Academy of Sciences Institute of Applied Mathematical Research
My dissertation was focused on Random Forests and Random Permutations.
September 1994 - June 2000
Master's degree in Math
Petrozavodsk State University
I studied various mathematical disciplines, including analysis, probability theory, programming, and mathematical logic.
I have developed the entire functional design of our CMS and ensured its
development and implementation with a team of developers.
This playlist comprises 17 videos in which I explain mathematics as if it were a foreign language. It starts at level A1, which covers basic logic, and ends at level B2, which explores formal arithmetic and set theory.
The book presents the fundamental concepts of classical mathematics, including
group theory, analysis, and geometry.
This is my monograph dedicated to the foundations of mathematics and its basic
ideas and principles.