Why shouldn’t the main focus of a project be on using complex techniques? For my part, there are three main reasons, which I’ll explain here.
Reason 1. The business doesn’t care
The first and most important reason is that the business doesn’t care! Your stakeholders are usually not concerned with the technical details of your model. Whether you used boosted trees or a neural network, to them, it’s all the same. What they want to know is how your model helps them achieve their business goals. If the model must be retrained often, you can justify your decision to use a simple model like logistic regression over a neural network because it’s super fast to train.
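To make that point concrete, here is a minimal sketch of what "super fast to train" and "interpretable" look like in practice: a logistic regression trained from scratch with gradient descent in plain Python. The dataset and feature names are made up for illustration; in a real project you would of course reach for a library like scikit-learn.

```python
import math

# Toy churn-style dataset (made up for illustration):
# each row is (monthly_usage_hours, support_tickets); label 1 = churned.
X = [(2.0, 5.0), (1.0, 4.0), (8.0, 0.0), (9.0, 1.0), (1.5, 3.0), (7.5, 0.5)]
y = [1, 1, 0, 0, 1, 0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.1, epochs=1000):
    """Logistic regression via gradient descent -- a few lines of code,
    and retraining on fresh data takes milliseconds."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

w, b = train(X, y)
# The weights are directly interpretable: on this toy data, a negative
# weight on usage hours means heavy users are less likely to churn, and
# a positive weight on support tickets means frustrated users churn more.
print(w, b)
```

Retraining this daily is trivial; explaining the weights to a stakeholder takes one sentence. Neither is true of a neural network.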
Often, the main goal of a machine learning model isn’t to achieve 100% accuracy. Instead, a machine learning model helps with business processes. Spending too much time optimizing the model delays the delivery of a working product to the market. It’s better to create an MVP, ensure it meets the business requirements, and get it into production. It’s essential to take not only performance but also interpretability, computation speed, development costs, robustness, and training time into consideration. These aspects matter too and can be just as relevant to business people as performance.
Besides yourself, there are other people who care about a complex model and state-of-the-art methods. Those people are often researchers or data science colleagues. If you work too closely with them instead of with the business, you can get to the point where you believe modeling is the main goal. To avoid this, try to work more closely with business people. Demo your product after every new feature implementation and ask the business if your assumptions are correct. Decisions that seem small can be really important to business people.
Reason 2. A complex model adds less value than a working MVP
The more time you spend on the model, the less time you have for good engineering practices, such as writing modular code, testing, architecture, logging, and monitoring. Setting these things up right at the beginning saves a lot of time later. You can easily add new features to a solid codebase. That is more useful than having a complex model in a Jupyter Notebook that performs slightly better but doesn’t run in production. Another advantage of a simple model is interpretability, which can help persuade stakeholders because they can see that the predictions make sense.
Especially at the start, focus on creating a product that works and has robust code and a well-crafted CI/CD pipeline. This makes it easier to improve the solution later on. If the business doesn’t feel the urge to improve the current solution, you can move on to another project. You didn’t waste your time creating a ‘perfect’ model.
Related to this is the Pareto principle. It’s a rule that states that 80% of results can be achieved through 20% of our efforts (aka the 80/20 rule). Often, creating a complex model that performs slightly better than a simple model doesn’t fall into that 80% of the results but is a task that is hard and takes a lot of time. The complex model is that last hard-to-reach 20% that takes 80% of the effort. Before you start, convince yourself it’s worth it.
Reason 3. Complex projects require more maintenance
The more complex the project, the more resources and time are needed to maintain it. This means you’ll spend more time fixing bugs, optimizing the model, and keeping the data up to date, and less time adding new features or improving the product. A simple project, on the other hand, requires less maintenance, which means you can spend more time iterating on the MVP and adding new features to improve the product.
An important thought to keep in mind is that the best solution is often the simplest solution that fits the requirements. This can help you decide whether that state-of-the-art deep learning model is really worth the extra work that comes with it! If two models perform equally well, and one is simple and the other is complex, go with the simple one.
One example from my work at a company: I tried to solve a scheduling problem with reinforcement learning. It was quite complex, and we were progressing slowly. The business became a bit annoyed and dissatisfied because we couldn’t show good results. When we switched our solution method to (good old) mathematical optimization, it went much faster! It was less interesting, but we gained the trust of the business and could implement new features and constraints easily.
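To give a flavor of the "good old" approach: scheduling problems like this can often be stated as "minimize the makespan of an assignment of jobs to machines." The sketch below is not the actual problem from the anecdote; it's a made-up toy instance solved by exhaustive search, which is exact and trivially correct at this size. A real project would hand the same formulation to a MILP or constraint-programming solver to scale it up.

```python
from itertools import product

# Hypothetical toy instance: assign 5 jobs (with the given durations)
# to 2 machines so that the makespan (latest finishing time) is minimal.
durations = [4, 7, 2, 5, 3]
n_machines = 2

def makespan(assignment, durations, n_machines):
    """Makespan of an assignment: assignment[j] is the machine for job j."""
    loads = [0] * n_machines
    for job, machine in enumerate(assignment):
        loads[machine] += durations[job]
    return max(loads)

# Exhaustive search over all machine assignments -- exact for small
# instances, and adding a new constraint is just one more filter.
best = min(product(range(n_machines), repeat=len(durations)),
           key=lambda a: makespan(a, durations, n_machines))
print(best, makespan(best, durations, n_machines))
```

The appeal for the business is exactly what the anecdote describes: the objective is explicit, the result is provably optimal for the formulation, and a new constraint is a one-line change rather than a retraining experiment.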