Financial Guidance Generator
The Problem: Some datasets are very hard to build for reasons such as the privacy of personal information. That is a growing problem, because datasets are becoming increasingly important to training AI models.
The Story: While building my AI companion, Duncan Gamabunta, I developed a synthetic data generator to fine-tune a SmolLM 1.7B Instruct model. I thought the resulting dataset was completely useless, but it got more downloads than I expected on huggingface.com, so I decided to apply the concept to use cases with a real, practical need for synthetically generated datasets. The first was maternal health, in a project I am calling MHGen (check out that project for more information). Then I decided to apply the concept to niche financial use cases, because grandma and I need to eat.
The Result: This project uses commercially developed AI models like ChatGPT in a Python system to generate realistic financial advice on different niche subjects. I was able to fine-tune a model using the dataset, and I hope to use it to provide more cost-effective financial assistance to those who can’t afford professional services.
Some of these datasets are available on my huggingface.com and kaggle.com pages. Larger datasets are available in my store for training and fine-tuning large language models.
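For readers curious how a generator like the one described under The Result might be put together, here is a minimal sketch in Python. It is not the project's actual code: the topic list, the prompt wording, the record fields, and the names build_prompt, generate_records, to_jsonl, and fake_model are all assumptions made for illustration. The model call is passed in as a plain function, so the sketch runs without any API key; a real setup would presumably swap fake_model for a call to a commercial chat model.

```python
import json
from typing import Callable, Dict, Iterable, List

# Hypothetical niche topics; the real project's topic list is not shown in the text.
TOPICS = [
    "budgeting on an irregular freelance income",
    "building an emergency fund on a low wage",
]


def build_prompt(topic: str) -> str:
    """Turn a topic into an instruction for the model (wording is an assumption)."""
    return (
        "Write a realistic question a person might ask about "
        f"{topic}, then answer it with practical, plain-language guidance."
    )


def generate_records(
    topics: Iterable[str], ask_model: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Query the model once per topic and keep only non-empty answers."""
    records = []
    for topic in topics:
        prompt = build_prompt(topic)
        answer = ask_model(prompt).strip()
        if not answer:
            continue  # skip empty generations instead of storing blank rows
        records.append({"topic": topic, "prompt": prompt, "response": answer})
    return records


def to_jsonl(records: Iterable[Dict[str, str]]) -> str:
    """Serialize records as JSON Lines, a common format for fine-tuning data."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def fake_model(prompt: str) -> str:
    """Stand-in for a real chat-model call; returns a placeholder string."""
    return f"[placeholder answer for: {prompt[:40]}...]"


if __name__ == "__main__":
    print(to_jsonl(generate_records(TOPICS, fake_model)))
```

Keeping the model call behind a plain function means the same loop can be pointed at a different provider, or at a stub for testing, without touching the dataset-writing code.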

