MarcosBrum/README.md

Portfolio of Data Science Projects by Marcos Brum

Hi there 👋, welcome to my portfolio. Here you will find links to the Data Science projects I have been working on. The purpose of these projects is to demonstrate my skills in solving business problems using techniques and tools of Data Science.

Marcos Brum

I am a data scientist experienced in developing business solutions, from the understanding of the business problem to interpreting the model results in terms of business value.

Skills:
Artificial Intelligence: Machine/Deep Learning; Quantum Machine Learning; Large Language Models; Natural Language Processing
Science: Physics: Quantum Physics, Relativistic Physics; Mathematics: Functional Analysis, Differential Equations
Programming Languages: Python; JavaScript; C++
Technologies: Qiskit; Git; Docker; LaTeX
Languages: Portuguese, English, German

Contacts

LinkedIn

ResearchGate

Data Science Projects

Rossmann Sales forecast

The stores of the Rossmann drugstore chain need to be renovated, and the CEO must decide how much to allocate to the renovation of each one. To support this decision, the Analytics team is asked to present a six-week sales forecast for each store, along with the total income expected across the chain. The forecast also tells the CEO which stores can cover their own renovation with the income from this period.

The gross expected income of most stores lies between R$5,000.00 and R$22,000.00. The chain as a whole is expected to earn R$289,822,112.00, with best- and worst-case scenarios of R$290,808,412.17 and R$288,835,860.27, respectively. These scenarios are derived from the model's statistical error (mean absolute percentage error).
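One way such scenarios can be built is sketched below, assuming each store's forecast is widened by its own mean absolute percentage error (MAPE) and the results are summed over the chain. The function name `forecast_scenarios`, the store names, and all numbers are hypothetical and only illustrate the arithmetic; they are not taken from the project.

```python
def forecast_scenarios(prediction, mape):
    """Return (worst, best) by shrinking or growing a forecast by its MAPE.

    `mape` is a fraction (0.08 means 8%), an assumption of this sketch.
    """
    if not 0 <= mape < 1:
        raise ValueError("mape is expected as a fraction in [0, 1)")
    margin = prediction * mape
    return prediction - margin, prediction + margin


# Illustrative (forecast, MAPE) pairs per store, not real project data.
stores = {"store_a": (12_000.0, 0.08), "store_b": (20_000.0, 0.05)}

total = sum(pred for pred, _ in stores.values())
worst = sum(forecast_scenarios(pred, mape)[0] for pred, mape in stores.values())
best = sum(forecast_scenarios(pred, mape)[1] for pred, mape in stores.values())

print(f"expected: {total:,.2f}  worst: {worst:,.2f}  best: {best:,.2f}")
```

Summing per-store bounds gives a conservative chain-level range, since it assumes every store misses in the same direction at once.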

Insurance Cross-sell

A health insurance company intends to offer its customers a new product: vehicle insurance. To do this efficiently, it gathered information about its customers and asked whether they would be interested in purchasing vehicle insurance. This information was passed on to a Data Science consulting office.

The office delivered a report identifying the most relevant of the gathered features and the probability of purchase for each customer. Quantitatively, ranking customers by predicted probability provides a lift gain of 2.5, thus reducing the sales cost to 40%.
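The link between a lift of 2.5 and a cost of 40% can be illustrated with a short sketch. It assumes that sales cost is proportional to the number of customers contacted per purchase, so the relative cost is the inverse of the lift (1 / 2.5 = 0.4). The function `lift_at` and the toy scores and labels are hypothetical.

```python
def lift_at(scores, labels, fraction):
    """Lift of the top `fraction` of customers when ranked by score.

    Lift = purchase rate among the top-ranked customers divided by
    the purchase rate over all customers.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Toy data: predicted purchase probabilities and actual interest (1 = yes).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

lift = lift_at(scores, labels, fraction=0.2)
relative_cost = 1 / lift  # cost per purchase relative to contacting at random

print(f"lift: {lift:.2f}, relative cost: {relative_cost:.0%}")
```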

Scientific Papers Classification

ArXiv is a public repository of scientific papers where researchers and students from anywhere in the world can find the latest results in many disciplines ranging from Natural Sciences to Mathematics and Computer Science. The papers are categorized by discipline and (possibly multiple) subdiscipline. The platform also displays each paper's abstract without the need to download the file.

The categorization of a new paper is important for authors to make sure their research will reach the intended audience and for the readers to find the most relevant works in their interest area. Presently the categorization process is the authors' sole responsibility. This fact raises some questions:

  1. Is the category chosen for a research paper the most appropriate?
  2. Is it possible to make a reasonable prediction of a paper's category given only a summary or its abstract?

In this project we show how a Large Language Model can be leveraged to help classify a scientific paper. We employ transfer learning, fine-tuning a pretrained transformer model to predict a paper's categories.
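Because a paper may belong to several categories, the prediction target is naturally multi-label. The sketch below shows one possible way to turn category strings into multi-hot target vectors that a classification head on a pretrained model could be trained against. It assumes categories arrive as space-separated strings; the helper names and the sample papers are hypothetical and not taken from the project.

```python
def build_label_index(category_strings):
    """Map every distinct category to a column index (sorted for stability)."""
    labels = sorted({cat for cats in category_strings for cat in cats.split()})
    return {label: i for i, label in enumerate(labels)}


def multi_hot(category_string, label_index):
    """Encode one paper's categories as a 0/1 vector over all known labels."""
    vector = [0] * len(label_index)
    for cat in category_string.split():
        vector[label_index[cat]] = 1
    return vector


# Hypothetical category strings, one per paper.
papers = ["hep-th gr-qc", "math.FA", "gr-qc math.FA"]

label_index = build_label_index(papers)
targets = [multi_hot(cats, label_index) for cats in papers]

print(label_index)
print(targets)
```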

Popular repositories

  1. Rossmann_sales_prediction (Jupyter Notebook)
     Sales forecast for the stores of the European drugstore chain Rossmann.

  2. scientific_production (TeX)
     This repository contains the list of my academic production: scientific papers and graduate courses lectured.

  3. MarcosBrum
     This is my profile.

  4. health_insurance_cross_sell (Jupyter Notebook)

  5. LNN (Python), forked from IBM/LNN
     A `Neural = Symbolic` framework for sound and complete weighted real-value logic.

  6. scientific_paper_classification (Jupyter Notebook)