
# Machine Learning Demystified

Several friends encouraged me to apply for a Data Scientist position at ID Insights, an organization I greatly admire, in a role I would be passionate about. Unfortunately, they require Python, and I'm most proficient with R. I decided to apply anyway, but first I familiarized myself thoroughly with numpy, pandas and sklearn, three of the most important libraries for machine learning in Python.

I used a dataset from Kaggle: [Health Care Cost Analysis](https://www.kaggle.com/flagma/health-care-cost-analysys-prediction-python/data), referenced as "insurance.csv" throughout the code. The reader will also have to change the variable "directory" to match their own setup.
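As a rough illustration of how the "directory" variable and "insurance.csv" fit together, here is a minimal sketch that uses only the standard library. The helper name `load_insurance` and the default value of `directory` are assumptions made for this example; the scripts in this folder may load the file differently (for instance with pandas).

```python
import csv
from pathlib import Path

# Assumption: change this to the folder where you saved the Kaggle download.
directory = Path(".")


def load_insurance(directory, filename="insurance.csv"):
    """Read the CSV into a list of dicts, one per row, keyed by column name.

    Hypothetical helper for illustration; all values are returned as strings.
    """
    with open(Path(directory) / filename, newline="") as handle:
        return list(csv.DictReader(handle))
```

Calling `load_insurance(directory)` would then return the rows, provided the file exists in that folder.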

The files currently in this directory are:

- [CleaningUpData.py](https://github.com/NunoSempere/nunosempere.github.io/blob/master/maths-prog/MachineLearningDemystified/CleaningUpData.py). I couldn't work with the dataset directly, so I tweaked it somewhat.
- [AlgorithmsClassification.py](https://github.com/NunoSempere/nunosempere.github.io/blob/master/maths-prog/MachineLearningDemystified/AlgorithmsClassification.py). As a first exercise, I try to predict whether the medical bills of a particular individual are higher than the mean of the dataset. Some algorithms, like Naïve Bayes, are not really suitable for regression, but are great for predicting classes.
- [AlgorithmsRegression,py](https://github.com/NunoSempere/nunosempere.github.io/blob/master/maths-prog/MachineLearningDemystified/AlgorithmsRegression,py). I try to predict the healthcare costs of a particular individual, using all the features in the dataset.
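
The difference between the two exercises above comes down to the target being predicted. The following is a minimal, dependency-free sketch of that difference, not the code from the scripts themselves: the rows, the column names (`age`, `smoker`, `charges`) and the function names are all assumptions invented for illustration.

```python
from statistics import mean

# Hypothetical rows standing in for insurance.csv; the column names and
# values are assumptions made for illustration, not real records.
rows = [
    {"age": 23, "smoker": "no", "charges": 2000.0},
    {"age": 35, "smoker": "no", "charges": 5000.0},
    {"age": 52, "smoker": "yes", "charges": 30000.0},
    {"age": 61, "smoker": "no", "charges": 13000.0},
]


def regression_target(rows):
    """Regression predicts the cost itself, so the target is the raw value."""
    return [row["charges"] for row in rows]


def classification_target(rows):
    """Classification predicts a class: 1 if the cost exceeds the dataset mean, else 0."""
    costs = regression_target(rows)
    threshold = mean(costs)
    return [int(cost > threshold) for cost in costs]


y_regression = regression_target(rows)
y_classification = classification_target(rows)
```

A classifier such as Naïve Bayes would be trained against `y_classification`, while a regressor would be trained against `y_regression`; the features can be the same in both cases.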

## Thoughts on sklearn
