Jan. 27, 2021, 8:59 a.m.
Dec. 24, 2020, 11:34 a.m.
When I start working on a machine learning project my first impulse is always to try to fit some models. At the end of the project I always remember how important exploratory data analysis is, and wish I had remembered sooner. Even on things where EDA doesn't seem necessary it usually is.
I have been working on an instance detection challenge and what use will EDA be on a dataset of annotated images ? It turns out a lot. After doing some EDA I found that many of the annotations were wrong, and simply by correcting them I was able to greatly increase my model's performance.
In addition, by doing some EDA on the predictions from a fitted model I was able to identify some common causes of errors and attempt to address them.
Dec. 17, 2020, 12:43 p.m.
I've had good luck with multi-scale training for image detection so I wanted to try it for classification of images that were of different sizes with objects at differing scales. I found some base code here , but this is based on PyTorch datasets, not on ImageFolders. I wrote some code to extend it to ImageFolders, which is in the below gist :
Nov. 26, 2020, 6:39 a.m.
I have some free Azure student credit so I decided to try to use some Azure VMs to train some of my models yesterday. I soon realized that a student account does not include a quota for any GPU more powerful than a K80 and with a student account there is no way to request increased quota. However, the student account does include a quota for "low priority instances" or spot instances, which are pre-emptible. So I set up a spot VM.
On AWS sometimes spot VMs can go for days before being pre-empted. Not so on Azure. I tried about a half dozen times, and no instance ever lasted long enough to complete even half an epoch, or about an hour. I was very disappointed because the spot prices were much better than AWS spot prices. For Azure spot instances you can set a price you are willing to pay, but even setting the price above the on demand price didn't make any difference.
My final complaint about Azure VMs is the shortage of images. AWS has a huge number of images for deep learning so you can basically just start the instance and you are set to go. Azure only has a few such images and they still required considerable configuration and installation of packages, which is made especially difficult by the fact that the instance kept shutting down.
I may use Azure on-demand VMs in the future, but the spot instances were largely useless.