Build a machine-learning forecasting program that studies the price movements of stocks traded on B3.
* The code must be built in R
* I have the database from 02/01/2018 to 01/08/2019 of all options of companies traded on B3. This database is made available by B3 itself for machine learning.
* The data input and output code must be well commented so that I understand what is happening.
* I must be able to set the start and end dates of the study period for the forecasting models. For example, the study starts on 9/1/2018 and runs until 11/1/2018; the tests and the prediction itself are run from that point on.
* I must be able to choose the forecast horizon for the stocks, e.g. 7 days.
* The 3 models used must pass a performance test, which can be the RMSE, the mean absolute error (MAE) or another standard metric. The important thing is that the two models used for prediction produce similar results, showing that no adjustments are necessary.
* The two models that study the database should be trained on the average price of the security plus at least one other variable. The forecast should be for the average price of the security.
* More variables can be used for the study; I must be informed which ones were chosen.
* If the database undergoes any “filter” treatment, I must receive this database after the treatment and a document explaining what was done so that I can replicate it.
* The developer must be able to leave the code running on my computer and for that I will allow remote access to my machine.
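To illustrate the study period, forecast horizon and performance test described above, here is a minimal sketch in base R. It is only an illustration under stated assumptions: the column names `date`, `ticker` and `avg_price`, the helper `add_target` and the toy data are hypothetical (the real B3 file layout may differ), the example dates are read as month/day/year, and the horizon is counted in rows per ticker, which equals calendar days only when there is one row per day.

```r
# Performance metrics named in the requirements.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
mae  <- function(actual, predicted) mean(abs(actual - predicted))

# Hypothetical helper: adds `target`, the average price `horizon` rows ahead
# for the same ticker (NA where no future row exists).
add_target <- function(df, horizon = 7) {
  df <- df[order(df$ticker, df$date), ]
  df$target <- ave(df$avg_price, df$ticker, FUN = function(p) {
    n <- length(p)
    if (n > horizon) c(p[(horizon + 1):n], rep(NA_real_, horizon)) else rep(NA_real_, n)
  })
  df
}

# Toy data standing in for the B3 database (two made-up tickers, one row per day).
dates <- seq(as.Date("2018-08-01"), as.Date("2018-12-31"), by = "day")
prices <- data.frame(
  date      = rep(dates, times = 2),
  ticker    = rep(c("AAA", "BBB"), each = length(dates)),
  avg_price = c(10 + 0.1 * seq_along(dates), 20 - 0.05 * seq_along(dates)),
  stringsAsFactors = FALSE
)

# Values the user chooses.
horizon     <- 7
study_start <- as.Date("2018-09-01")
study_end   <- as.Date("2018-11-01")

prices <- add_target(prices, horizon)

# Training rows: inside the study period, with a target that is also inside it.
train <- prices[prices$date >= study_start &
                prices$date <= study_end - horizon &
                !is.na(prices$target), ]
# Test rows: after the study period, where the future price is known.
test <- prices[prices$date > study_end & !is.na(prices$target), ]

# Naive baseline (future price = current price), only to show the metrics in use.
naive <- test$avg_price
cat("RMSE:", rmse(test$target, naive), " MAE:", mae(test$target, naive), "\n")
```

Stopping the training rows `horizon` days before the end of the study period keeps prices from the test period out of the training targets.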
The code must have 5 steps:
1 Run the XGBoost model on the chosen dates, validate its performance and save the results to a file on the computer. The results should be the expected percentage return for each investment option, the accuracy of the model and the expected values for future dates. A printout showing the results can be produced.
2 Run a second model based on a decision tree on the chosen dates, validate its performance and save the results to a file on the computer. The results should be the expected percentage return for each investment option, the accuracy of the model and the expected values for future dates. A printout showing the results can be produced.
3 Add the results for each investment option from the two files and divide by 2 to obtain the average of the two results, validate the performance and save the results to a file on the computer. The results should be the expected percentage return for each investment option, the accuracy of the model and the expected values for each day.
4 Calculate the variance for each investment option using the entire database and save it to a fourth file on the computer. If necessary, I can provide the formula for calculating the variance.
5 The program reports the range over which the variance varies. I choose a value within this range, and the TOPSIS model is then run using the variance of each investment option and the expected return of each option obtained by averaging the decision tree and XGBoost models, ranking the best returns for the accepted level of variance (“risk”).
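As a rough illustration of steps 1 to 3, the sketch below fits both models on toy data, averages their predictions and saves one file per step. It is a sketch under assumptions, not a finished implementation: the data are simulated, the column names (`avg_price`, `volume`) and file names are hypothetical, `volume` is only a stand-in for the extra variable, RMSE is used as the performance test, and the expected return is assumed to be the predicted price change relative to the current price. It relies on the `rpart` and `xgboost` packages.

```r
library(rpart)    # decision tree
library(xgboost)  # gradient boosting

set.seed(42)
horizon <- 7
n <- 200

# Simulated stand-in for one investment option (hypothetical columns).
dat <- data.frame(
  avg_price = 20 + cumsum(rnorm(n, 0, 0.3)),
  volume    = runif(n, 1e3, 1e4)
)
# Target: average price `horizon` rows ahead.
dat$target <- c(dat$avg_price[(horizon + 1):n], rep(NA_real_, horizon))
dat <- dat[!is.na(dat$target), ]

train <- dat[1:150, ]
test  <- dat[151:nrow(dat), ]
features <- c("avg_price", "volume")

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
# Assumed definition of expected return, in percent.
return_pct <- function(predicted, current) 100 * (predicted - current) / current

# Step 1: XGBoost regression.
dtrain <- xgb.DMatrix(data = as.matrix(train[, features]), label = train$target)
dtest  <- xgb.DMatrix(data = as.matrix(test[, features]))
xgb_fit <- xgb.train(
  params  = list(objective = "reg:squarederror", max_depth = 3),
  data    = dtrain,
  nrounds = 50
)
pred_xgb <- as.numeric(predict(xgb_fit, dtest))

# Step 2: regression tree.
tree_fit  <- rpart(target ~ avg_price + volume, data = train, method = "anova")
pred_tree <- as.numeric(predict(tree_fit, newdata = test))

# Step 3: add the two predictions and divide by 2.
pred_avg <- (pred_xgb + pred_tree) / 2

# One results file per step, written to a temporary folder in this sketch.
out_dir <- tempdir()
preds <- list(step1_xgboost = pred_xgb, step2_tree = pred_tree, step3_average = pred_avg)
for (name in names(preds)) {
  res <- data.frame(
    actual              = test$target,
    predicted           = preds[[name]],
    expected_return_pct = return_pct(preds[[name]], test$avg_price)
  )
  write.csv(res, file.path(out_dir, paste0(name, ".csv")), row.names = FALSE)
}

# Performance of each model and of the average.
scores <- sapply(preds, function(p) rmse(test$target, p))
print(scores)
```

Comparing the three RMSE values side by side is one simple way to check whether the two models produce similar results.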
* Finally, the 6 and 12 best-performing options should be printed on the screen, showing the expected return for each of these investment options, how accurate each of the two models was and how accurate the average of the two models was. All of these results should be saved to a file.
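Steps 4 and 5 and the final ranking could look roughly like the base R sketch below. Everything in it is illustrative: the option names, prices and expected returns are made up, the variance is taken as the sample variance of the average price (the formula I provide may differ), the two TOPSIS criteria are given equal weights, and the helper `topsis` is a hypothetical function written for this example, not part of any package.

```r
# Hypothetical TOPSIS helper.
# m: numeric matrix (rows = options, columns = criteria)
# benefit: TRUE where larger is better, FALSE where smaller is better
topsis <- function(m, weights, benefit) {
  norm  <- sweep(m, 2, sqrt(colSums(m^2)), "/")       # vector normalisation
  v     <- sweep(norm, 2, weights / sum(weights), "*") # apply weights
  ideal <- ifelse(benefit, apply(v, 2, max), apply(v, 2, min))
  anti  <- ifelse(benefit, apply(v, 2, min), apply(v, 2, max))
  d_pos <- sqrt(rowSums(sweep(v, 2, ideal, "-")^2))    # distance to ideal
  d_neg <- sqrt(rowSums(sweep(v, 2, anti, "-")^2))     # distance to anti-ideal
  d_neg / (d_pos + d_neg)                              # closeness, between 0 and 1
}

# Step 4: variance per investment option over the whole (toy) price history.
prices <- data.frame(
  option    = rep(c("OPT_A", "OPT_B", "OPT_C", "OPT_D"), each = 3),
  avg_price = c(10, 11, 12,  10, 12, 14,  10, 10.5, 11,  10, 15, 20),
  stringsAsFactors = FALSE
)
v <- tapply(prices$avg_price, prices$option, var)
variance_tbl <- data.frame(option = names(v), variance = as.numeric(v),
                           stringsAsFactors = FALSE)
write.csv(variance_tbl, file.path(tempdir(), "step4_variance.csv"), row.names = FALSE)

# Stand-in for the averaged expected returns produced in step 3.
returns_tbl <- data.frame(
  option              = c("OPT_A", "OPT_B", "OPT_C", "OPT_D"),
  expected_return_pct = c(5, 8, 3, 12),
  stringsAsFactors = FALSE
)
options_tbl <- merge(returns_tbl, variance_tbl, by = "option")

# Step 5: report the variance range, then apply the level of risk the user accepts.
cat("Variance ranges from", min(options_tbl$variance),
    "to", max(options_tbl$variance), "\n")
max_variance <- 5   # value the user would choose within that range
eligible <- options_tbl[options_tbl$variance <= max_variance, ]

m <- as.matrix(eligible[, c("expected_return_pct", "variance")])
eligible$score <- topsis(m, weights = c(0.5, 0.5), benefit = c(TRUE, FALSE))
ranked <- eligible[order(-eligible$score), ]

# Final step: print the 6 and 12 best options (fewer if fewer are eligible) and save.
print(head(ranked, 6))
print(head(ranked, 12))
write.csv(ranked, file.path(tempdir(), "final_ranking.csv"), row.names = FALSE)
```

In this sketch the chosen variance acts as a cut-off before ranking; treating it instead as a weight inside TOPSIS would be another possible reading of step 5.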