Innodata — Sitting Down With an Expert

Sitting Down With an Expert

Conversations with Industry Experts

Bringing Computer Vision to Small Businesses

A Conversation with Sparrow Computing's Ben Cook

“There has never been a better time for small businesses to use computer vision, thanks to the convergence of several technological trends. For a relatively small investment, businesses can obtain anything from purpose-built computer vision systems that include high definition cameras and integrated GPU computing resources, to more DIY-type setups cobbled together from off-the-shelf security cameras and cloud-based storage.

We’re constantly meeting entrepreneurs who have innovative ideas about how to deliver value to customers, and they just need a little help bridging the gap from “I think a camera could help me do this better” to “problem solved.” It’s a really rewarding experience.”

— Ben Cook

Ben Cook is a Machine Learning Engineer and Founder of Sparrow Computing, a company whose mission is to bring computer vision to every entrepreneur who needs it. His work blends custom software development with advising machine learning teams, so that Sparrow’s clients receive world-class computer vision solutions designed for their specific problems. Before Sparrow, Ben led the Applied Machine Learning team at Hudl, a sports software company that ingests petabytes of video every year.

Innodata sat down with Ben to explore how computer vision has evolved to the point where it is accessible, affordable, and even invaluable to small businesses.

Innodata: Thanks for sitting down with us. First, what are the top use cases for computer vision in small businesses right now? How can computer vision make businesses work better?

BC: The key phrase here is “small businesses.” Modern computer vision was pioneered at a handful of giant tech companies, but most small businesses don’t have the resources to devote to R&D at that scale. For them, computer vision is all about solving real-world problems, the specifics of which are as unique as each small business. Some of the best use cases involve automatic monitoring of physical assets. Whether it’s inventory or capital equipment, small businesses can use computer vision to intelligently track some of their most important resources. For decades, video surveillance has enabled small business owners to “see” their assets, but computer vision takes that a step further by extracting meaningful data from that video feed: counting the items on shelves to get on-demand inventory numbers, continuously inspecting equipment for wear and defects, or running a custom visual sensor to track the progress of a critical process.
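
As a rough illustration of the shelf-counting idea, the Python sketch below turns detector output into inventory numbers. It is not Sparrow's code: the labels, confidence scores, `count_inventory` helper, and the 0.5 cutoff are all made up for the example, and a real system would get its detections from a trained object detection model.

```python
from collections import Counter

# Hypothetical detector output for one shelf image: (label, confidence) pairs.
detections = [
    ("cereal_box", 0.94),
    ("cereal_box", 0.88),
    ("soup_can", 0.91),
    ("cereal_box", 0.42),  # low confidence, e.g. a partly hidden item
    ("soup_can", 0.77),
]


def count_inventory(detections, min_confidence=0.5):
    """Count detected items per label, ignoring low-confidence detections."""
    return Counter(
        label for label, score in detections if score >= min_confidence
    )


inventory = count_inventory(detections)
print(dict(inventory))  # {'cereal_box': 2, 'soup_can': 2}
```

The cutoff is a business choice as much as a technical one: lowering it counts more partly hidden items but also lets in more spurious detections.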

Innodata: Could you describe a few of your projects and customer experiences so we can see how this plays out in the real world?

BC: One of our clients, Fast Forward AI, is using computer vision to automatically inspect power transmission lines. For that project, we built a computer vision system that detects and tracks utility poles in images captured by a vehicle-mounted camera. We were able to leverage a suite of technology that included panoramic cameras, thermal cameras, and precision GPS so that this particular small business owner can offer a comprehensive inspection and mapping service to utility companies that replaces slow, manual processes.

In another instance, we helped an agricultural technology company build a mobile app for digital recordkeeping of livestock. Our computer vision system automatically extracts pertinent information (e.g., breed and color) from an image of an animal, so that ranchers can instantly access data on their most valuable assets.

Innodata: How have technology and external conditions changed to make computer vision more accessible and affordable for smaller companies? How resource-intensive is computer vision implementation?

BC: There has never been a better time for small businesses to use computer vision, thanks to the convergence of several trends: the sophistication of open-source frameworks for computer vision (e.g., PyTorch, TensorFlow, Keras), the prevalence of high-quality cameras, and the availability of advanced computing resources (i.e., GPUs). The Internet of Things has been an emerging phenomenon for several years now, and connected cameras are becoming increasingly commonplace, even for consumer usage (e.g., vehicle dash-cams and smart doorbells), so it’s easy to see why more and more small businesses are inspired to use image and video data in whole new ways.

Innodata: What are the most important factors in getting computer vision right? What types of errors have the largest consequences?

BC: For small businesses, computer vision is all about solving real-world problems. It’s not necessarily about chasing the bleeding edge of algorithm development. To solve real-world problems, the two most important factors are 1) having high-quality data, and 2) developing a production-ready system. Anyone who’s trained a machine learning model knows that high accuracy in training is all well and good, but the real challenge comes when that model is “released into the wild” and sees brand-new data for the very first time. Instead of trying to optimize the model hyperparameters for the training data, we think it’s important to build a system that makes re-training and re-deploying a model fast and easy. Additionally, that same computer vision system should be capable of operating within a business-relevant context. Demoing a model is one thing, but computer vision is most valuable to businesses when it’s delivering decision-quality data.

We typically have some control over the types of errors a model makes. False negatives are often more detrimental than false positives, but that’s really a decision we want entrepreneurs to make based on the needs of the system. There will never be a perfect computer vision system, so we spend a lot of time talking with our customers to clearly communicate the error characteristics of the model (i.e., in what situations it can be trusted to perform well, and when it is more likely to make errors) so they can optimize business operations to take advantage of highly accurate predictions and mitigate the damage from erroneous output.
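
One common way to exercise that control is to adjust the confidence threshold at which a prediction counts as a positive. The Python sketch below is an illustration only, not Sparrow's method: the scores, labels, candidate thresholds, and cost weights are invented, and `error_counts` and `best_threshold` are hypothetical helper names. It shows how the preferred threshold shifts depending on which kind of error the business considers more expensive.

```python
# Hypothetical validation set: model confidence that a defect is present,
# paired with the true label (1 = defect, 0 = no defect).
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]


def error_counts(scores, labels, threshold):
    """Return (false_positives, false_negatives) at a given threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn


def best_threshold(scores, labels, fp_cost, fn_cost, candidates):
    """Pick the candidate threshold with the lowest total error cost."""
    def cost(threshold):
        fp, fn = error_counts(scores, labels, threshold)
        return fp * fp_cost + fn * fn_cost

    return min(candidates, key=cost)


candidates = [0.25, 0.35, 0.50, 0.60, 0.75]

# Missed defects are 5x as costly as false alarms: prefer a low threshold.
print(best_threshold(scores, labels, fp_cost=1, fn_cost=5, candidates=candidates))  # 0.35

# False alarms are 5x as costly as missed defects: prefer a high threshold.
print(best_threshold(scores, labels, fp_cost=5, fn_cost=1, candidates=candidates))  # 0.75
```

The cost weights are the part the business owner supplies; the model and the validation data stay the same in both cases.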

Innodata: How do you collect and prepare data for computer vision model training? What levels of accuracy do your models typically achieve?

BC: Our clients usually come to us with a bunch of images or videos, eager to extract information from them. We always start by understanding the business problem and building a small proof of concept to see if a solution is technically feasible.

After that, we start thinking seriously about how to build a large, high-quality dataset: how to sample the data, how to label it, and how to speed up annotation with pre-trained models. This is where a reliable data annotation partner really adds value to our operations. We trust them to help us transform the data into training-ready annotations, which allows us to move quickly through the algorithm development lifecycle. For example, I already mentioned that one of our clients uses computer vision to automatically inspect power transmission poles. The initial prototype for this product was developed in a rural setting, and our computer vision model performed well after training on data from this setting. However, the next phase of the product roll-out involved a demo of the inspection capability in a more urban environment. Thanks to the annotation team, we were able to quickly correct some model-derived predictions, get the model re-trained, and push a new version of the inspection model that performed well in both urban and rural settings.
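
A minimal Python sketch of how model predictions might speed up annotation in a case like this: confident predictions are kept as draft labels, and the rest go to human annotators, least confident first. This is an assumption about the general technique, not a description of Sparrow's pipeline. The file names, scores, `triage` helper, and 0.9 cutoff are all hypothetical.

```python
# Hypothetical predictions from a model trained on rural images,
# run on new urban images.
predictions = [
    {"image": "urban_001.jpg", "label": "utility_pole", "score": 0.97},
    {"image": "urban_002.jpg", "label": "utility_pole", "score": 0.58},
    {"image": "urban_003.jpg", "label": "utility_pole", "score": 0.31},
    {"image": "urban_004.jpg", "label": "utility_pole", "score": 0.92},
]


def triage(predictions, auto_accept=0.9):
    """Split predictions into draft labels and a human review queue."""
    accepted = [p for p in predictions if p["score"] >= auto_accept]
    review = [p for p in predictions if p["score"] < auto_accept]
    # Send the least confident predictions to annotators first.
    review.sort(key=lambda p: p["score"])
    return accepted, review


accepted, review = triage(predictions)
print([p["image"] for p in accepted])  # ['urban_001.jpg', 'urban_004.jpg']
print([p["image"] for p in review])    # ['urban_003.jpg', 'urban_002.jpg']
```

Once annotators have corrected the review queue, the corrected labels can be merged with the accepted ones and used to re-train the model on the new setting.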

Finally, we help our clients deploy the model and integrate it with their existing software so that it can solve real business problems.

How we measure the accuracy of our algorithms varies from one system to the next because different metrics make sense in different contexts. Our goal is always to achieve good enough performance to solve the underlying business problem, but we never guarantee a certain level of accuracy because there are factors outside our control that impact the result. It can be tempting at times to chase every last percentage-point of accuracy, but we’ve found that while flashy performance metrics attract a lot of attention, at the end of the day the most important feedback on the model is going to come from the customer.

Innodata: Do you use synthetic data for model training? If so, what types?

BC: Currently, we always use real images and videos pulled from the environment that we are trying to teach the model to understand. But clearly, there are cases where it’s difficult to generate all the examples you want the model to learn. As synthetic data gets more realistic and more widely accessible, we may start to include it in our process.

Innodata: What computer vision ethics or privacy challenges have you faced, and what measures have you taken to overcome and/or prevent them?

BC: Computer vision systems are becoming increasingly powerful, and that technology can absolutely be used for nefarious purposes. Currently, we focus exclusively on supervised learning use cases (e.g., no deepfakes). But even within the realm of supervised models, we’re wary of technology like facial recognition and license plate readers, and we don’t work on any applications where the goal is to surveil people.

Innodata: What are the greatest obstacles to computer vision implementation, i.e., what do you think prevents small businesses from considering computer vision solutions? How would you address these issues?

BC: Even though computer vision is more accessible than ever to small businesses, the best entrepreneurs are laser-focused on their domain-specific problems and don’t always know what’s possible with modern computer vision. Even when they do know what’s possible, they usually don’t have the budget to build an entire team in-house. We address these issues by optimizing for efficiency and focusing on practical results. Unless a system can produce value for our small business clients, computer vision will be stuck in the realm of a “cool technological gimmick” instead of a “worthwhile business practice.” Sparrow Computing exists to ensure small businesses and entrepreneurs get real results from computer vision.

Innodata: Where do you see computer vision going in the next 10 years? What new or surprising use cases might we encounter going ahead?

BC: I think we’ll see computer vision expand to more and more industries and use cases. The idea of self-driving cars and other autonomous vehicles is very much a part of the current zeitgeist, but I don’t think it’ll be long until computer vision becomes a common technology in other areas of everyday life. The recent World Cup tournament was a good example of that: TV audiences got to see computer vision at work in determining offside calls and in other real-time sports analytics. Instances like this will only expand people’s understanding of how computer vision works, and its potential use cases are really only constrained by the imagination of the small businesses and entrepreneurs that will try to bring new ideas to market.
