AI Part 5: Practical Implementation

Neural AI can be thought of as semi-controlled programming using simulated neurons inspired by nature, rather than pure symbolic constructs like logic, rules, and ideas. It’s also a kind of tacit admission that Mother Nature is way better at solving some kinds of problems than programmers are.
We can combine these neurons into patterns that accomplish various goals, such as:
- Computer vision: Drone inspection of solar panel damage
- Time-series models: Identifying patterns in sequential data, such as temperature, vibration data, or circuit voltage to detect problems
- Natural language processing: Document classification, sentiment analysis, computer code generation
- Object detection and segmentation: Automated meter reading, vegetation management, safety compliance monitoring
- Resolution enhancement: Improving low-quality images, such as old data or satellite data
- Optical Character Recognition: Loading old hand-written records into a database, equipment name-plate reading
- Reinforcement Learning Agents: Electrical grid balancing, storage optimization, inventory management
Neural AI should be used in cases that play to its strengths, where the chances and costs of failure are low (if the AI misses a broken solar panel, it will likely catch it tomorrow and the cost is negligible; less so in medical contexts or power-grid operations, to say nothing of large-scale weapon systems).
The best way to look at neural AI is as a programming technique, like the wealth of symbolic AI that came before it and the even larger corpus of ordinary programming techniques.
Each of these brings its own strengths and weaknesses, and none of them is a silver bullet. Neural AI can do some things much faster than the others and, indeed, can do some things that are basically impossible with the others.
Lost in the hype is that the reverse is also true. There is no circumstance in which it makes sense to write, say, a file system with neural AI. File systems must work 100% of the time, quickly and efficiently; this is antithetical to neural AI. The technologies are complementary, and any successful solution will likely involve some mix of these approaches.
Internal AI Projects
At a high level, AI initiatives don’t look notably different from any ordinary IT project; the main phases, goals and deliverables will be familiar to any product, project or IT manager. Where AI projects diverge is in data requirements, infrastructure, skills, and maintenance.
Managers should not adopt the mindset that the latest technology is the best technology. Newer technology is often more expensive, less reliable and less fit to purpose for many requirements.
Equations for relativistic gravity are more accurate than their older Newtonian equivalents. They are also orders of magnitude more difficult and complex to compute. Therefore, most work with gravity is still done using Newtonian equations: cheaper, easier, and more than accurate enough for the vast bulk of scientific and engineering needs.
By the same token, a more appropriate lens through which to view technology selection is to choose the least resource-intensive one which fully meets the project criteria, where resource costs may vary from organization to organization based on existing skill sets and installed technology base.
Generally speaking, a project should use the simplest and most efficient technology that will meet the project goals. In order of ascending cost/complexity:
- Traditional code
- Statistical AI
- Symbolic AI
- Non-Generative Neural AI
- Generative Neural AI
Most organizational needs will still be met by traditional code, but there will certainly be cases where some form of AI is the best or only fit. Whether programmers use AI tools to build that code (or those AIs) is a separate topic.
Data Requirements
The data requirements for a successful AI project will almost certainly be novel and unfamiliar to the majority of IT departments. In fact, AI initiatives often spend 70-80% of their resources preparing the data for use with the model.
In fortunate circumstances, the AI can be trained with unstructured data: feeding a large number of PDFs into a neural network may be all that is required to incorporate new knowledge into a model, particularly for tasks like document search or general knowledge enhancement.
However, in most cases a great deal of effort needs to be devoted to providing the model with the data it needs, whether that is a framework for learning by trial and error, a program for synthesizing data or, in the worst and quite common case, annotating data by hand.
In the annotation case, taking internal company data, extracting the meaningful portions, converting them into a usable format and then developing suitable questions is very laborious.
Generally speaking, data fed to an AI during training must be carefully prepared in a kind of question/answer format. For example:
- "Summarize this article: [article text]" → "Here's a summary: [summary]"
- "Translate to French: Hello, how are you?" → "Bonjour, comment allez-vous?"
- "What is the capital of France?" → "The capital of France is Paris."
- <X-Ray or CT scan results> → “Spiculated Mass”
- <A set of cardholder metrics> → “Credit card fraud”
You are feeding the AI examples of the ideal answers you'd like to see it produce for various kinds of queries. A human might only need a few examples to get the idea, but a neural AI will need hundreds or thousands; human learning is very, very efficient. The quality of that input will directly determine the quality of the output.
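A minimal sketch of how such pairs are often stored in practice, assuming a JSON Lines layout with hypothetical `prompt`/`response` field names (exact schemas vary by training framework):

```python
import json

# Hypothetical training pairs in the question/answer style described above.
examples = [
    {"prompt": "What is the capital of France?",
     "response": "The capital of France is Paris."},
    {"prompt": "Translate to French: Hello, how are you?",
     "response": "Bonjour, comment allez-vous?"},
]

def to_jsonl(records):
    """Serialize prompt/response pairs, one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

def from_jsonl(text):
    """Parse the file back, discarding malformed or incomplete lines."""
    records = []
    for line in text.splitlines():
        try:
            r = json.loads(line)
        except json.JSONDecodeError:
            continue
        if r.get("prompt") and r.get("response"):
            records.append(r)
    return records
```

Much of the "laborious" preparation work described above amounts to producing files like this, at scale, from messy internal data.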
Thankfully, as noted below, this task can often be distributed among a large number of non-specialists and learning improvements are being made rapidly.
Infrastructure
AIs, particularly the statistical and neural varieties, require a robust technology infrastructure. Even the best-run IT departments will find they need to invest in new hardware.
First and foremost, AI requires GPUs, for both training and inference (producing output). It’s well known that NVIDIA’s hardware is the most desirable, principally because of how well supported it is in the software ecosystem (CUDA); it just works. AMD is making steady improvements and may be worth considering, particularly if hardware budget is limited. Exotic options also exist for high-end requirements, such as Cerebras.
Aside from standard concerns like memory capacity and processing speed, one factor that will trip up many IT departments is the need for very high memory speeds, particularly with the GPUs. As neural models get larger, the expensive GPUs increasingly sit unused, waiting for new data to arrive. This is analogous to the old problem of CPUs sitting idle waiting for new data from hard disks, which led to the development of modern solid-state storage.
AI work not only requires large amounts of storage; that storage often needs to be very fast as well. Consequently, storage will likely be the highest cost after GPUs.
In addition, network upgrades will likely be needed to support storage clusters and to feed the systems hosting the GPUs.
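A back-of-envelope way to reason about this bottleneck: the GPU is busy only as long as storage and network can feed it. The numbers below are purely illustrative, not benchmarks:

```python
def gpu_utilization(gpu_consumption_gbps, storage_throughput_gbps):
    """Fraction of time the GPU has data to work on: if storage cannot
    feed the GPU as fast as it consumes batches, the GPU sits idle."""
    if gpu_consumption_gbps <= 0:
        raise ValueError("GPU consumption must be positive")
    return min(1.0, storage_throughput_gbps / gpu_consumption_gbps)

# Illustrative numbers only: a training job that can ingest 12 GB/s,
# fed by a 3 GB/s storage array, runs the GPU at roughly 25% utilization.
util = gpu_utilization(12.0, 3.0)
```

The same arithmetic explains why storage and network upgrades so often follow the GPU purchase: an expensive accelerator running at a quarter of its capacity is mostly wasted money.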
Staff Skills
The skills necessary to underwrite a successful AI project fall into three main groups: leadership, implementation, and data support.
Leadership
Ideally, all but the most straightforward AI projects should have one staff member who is capable of reasoning about AI models, what they’re doing and how they could be modified to achieve a better result, as either the project leader or the principal subject matter expert.
People with the necessary skills and experience to take on this role are still very rare in the marketplace. If they can be found at all, they will be very expensive and this will remain the case for the foreseeable future. As such, companies may wish to evaluate existing staff or external candidates without formal AI experience and retrain them to handle AI projects.
In the absence of long-term direct experience, the single most important attribute to look for is abstract reasoning ability (also known as fluid intelligence). Such thinkers are minimally dependent on prior learning; a huge benefit in working with novel systems that are not broadly understood.
Such people are also likely to be adept at reasoning about complex, abstract systems through abductive reasoning, which is finding the simplest and most likely explanation for a given set of facts. It is the skill of successfully navigating a complex landscape in the dark.
Fluid intelligence can be assessed using a standard psychological test such as Raven’s Advanced Progressive Matrices, and of the tests mentioned here, it should be weighted the most strongly. While the author is unaware of any tests that directly focus on abductive reasoning, the “Watson-Glaser Critical Thinking Appraisal” and “Torrance Tests of Creative Thinking” (TTCT) tests are both excellent proxies and simultaneously evaluate other useful traits.
A number of hard skills are desirable in candidates as they are all directly applicable to working with AI and will reduce the time necessary to make them effective:
- Linear Algebra (neural AI *IS* linear algebra)
- Calculus
- Probability & Statistics
- Graph Theory
- Programming
- Databases
- IT Infrastructure
- Domain expertise of the organization’s business
Producing functional AI is a completely separate discipline from programming, and there is little to no overlap between the two, even if programming is useful as a supporting skill. Managers should not automatically assume that programmers will make good AI developers.
Finally, ordinary HR suitability criteria such as enthusiasm for the role continue to be important as well.
Implementation
AI models are inevitably surrounded by an ecosystem of custom code, compute, storage and networking. The extent of that ecosystem scales with the project requirements, from a single laptop to a campus of data centers and beyond.
As such, implementation teams see a lot of overlap with traditional IT skillsets, such as programming, infrastructure and their modern hybrid, DevOps.
Programmers will have the highest re-skilling burden. As noted above, there is little overlap between AI skills and traditional programming skills. The staff will need a long lead time to adjust to the new requirements (“How do I even begin to debug this?”).
In this context, AI can be seen as domain knowledge. If a programmer were going to write a program analyzing wireless spectrum, they should be fluent in that domain. They are ultimately the ones who encode the project’s ideas into the fledgling system.
Programming staff should possess as many of the traits noted in the “Leadership” section above as possible. Development-oriented leadership candidates who did not ultimately get selected for those positions are best placed in these roles.
Infrastructure and DevOps members will not necessarily need new skills per se, but if they are working on a larger neural AI project, experience with high-performance systems will be a significant benefit (such as managing sizable storage clusters or high-throughput data pipelines).
Data Support
After the implementation team decides what data is needed, designs a framework for organizing it and marshals the data for processing, it falls to the data support people to actually annotate the data.
The level and type of skill necessary is entirely dependent on the domain of the project. The pool of appropriate candidates for building a medical imaging analysis AI is limited to radiologists. The pool of appropriate candidates for making an AI that scans pictures for common forms of damage to walls (cracks, stains, etc) is virtually everyone. The key is that the pool should err on the side of being large in order to improve processing speed.
Accuracy is key and, in the case of manual annotation, project owners should expect to classify their data set at least three times, by different people. In such cases, the annotation process should be designed to be either binary (yes/no, true/false) or a simple one-word answer (“strawberry”, “car”, “wrench”). The final answer is determined by the best-of-three from the staff annotations.
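The best-of-three resolution described above can be sketched in a few lines of Python (`resolve_label` is a hypothetical helper name, not a standard API):

```python
from collections import Counter

def resolve_label(annotations):
    """Best-of-N label resolution: the answer given by the majority of
    annotators wins; anything without a majority is flagged for review."""
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    if votes > len(annotations) / 2:
        return label
    return None  # no majority: send back for re-annotation
```

For example, `resolve_label(["strawberry", "strawberry", "cherry"])` settles on "strawberry", while a three-way disagreement returns nothing and goes back into the annotation queue.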
Accuracy is further improved by processing many samples of the same thing with slight variations. To teach a neural AI what a strawberry looks like, several hundred pictures of strawberries of varying sizes, types, and shapes, from different angles and under varying lighting conditions, should be used. As noted above, each of those pictures needs to be annotated three times. AI is the very definition of the phrase “garbage in, garbage out”.
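A toy sketch of producing such slight variations, using only the standard library and a flat list of grayscale pixel values in place of a real image (real augmentation pipelines also rotate, crop, and recolor):

```python
import random

def vary_brightness(pixels, rng):
    """Return a copy of a grayscale image (a flat list of 0-255 values)
    with brightness scaled by a random factor: one of the 'slight
    variations' used to stretch a labeled sample into many."""
    factor = rng.uniform(0.8, 1.2)
    return [min(255, max(0, round(p * factor))) for p in pixels]

rng = random.Random(42)
image = [100] * 16                       # a tiny 4x4 grayscale "image"
variants = [vary_brightness(image, rng) for _ in range(3)]
```

Each variant inherits the original's annotation, which is why augmentation multiplies the value of every hand-labeled sample.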
Indeed, the foundational ImageNet dataset comprised some 14 million images, each evaluated three times by ordinary people around the world via Amazon’s Mechanical Turk. Each pictured object (tomato, strawberry) had several hundred samples within the data set.
Ultimately, what data is required and who is eligible to contribute annotations is specific to an individual project and under the judgment of its leadership. It may be just a few people in the organization, or something to which every staff member can contribute.
Maintenance
While AI is subject to the same break-fix cycle as traditional software, it also suffers from a phenomenon called “model drift”.
It is sometimes necessary to adjust traditional software due to changes in its operating environment. Perhaps a new feature of the operating system comes out, perhaps a feature is removed, perhaps a new standard is released, like a new version of USB. In all of these cases, software needs to be adjusted to stay current or else it will fail; it is no longer closely aligned with its environment.
While this is also true for AI software, models often require additional re-training on a regular basis to keep them in line with their environment. Small changes will not necessarily break the model in the way that traditional software fails (hard error/crash), but will cause the model to become a little less accurate.
However, the number of potential changes and the number of uncontrollable sources of such change can be very large, leading to the requirement for some level of constant upkeep.
Some applications, such as looking for particular patterns in power flows, may never need re-training, while others, such as a model that tracks the evolving narratives in world events, will need constant updating, even to the point where it cannot be updated fast enough.
It should also be noted that the cost of correcting this drift scales very quickly. It is best to avoid letting drift build up as the cost to fix a 3% variance may be an order of magnitude higher than fixing a 1% variance. A 5% variance may require complete retraining.
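As a sketch, drift monitoring can be as simple as comparing current accuracy against the baseline measured at deployment; the thresholds below are illustrative, echoing the idea that a 3% variance costs far more to fix than a 1% variance:

```python
def retraining_urgency(baseline_accuracy, current_accuracy):
    """Classify model drift by how far accuracy has fallen from its
    deployment baseline. Thresholds are illustrative, not prescriptive."""
    drift = baseline_accuracy - current_accuracy
    if drift < 0.01:
        return "ok"
    if drift < 0.03:
        return "schedule fine-tuning"
    if drift < 0.05:
        return "fine-tune now"
    return "full retraining likely required"
```

Wiring a check like this into routine monitoring is what keeps drift from quietly accumulating into a full retraining bill.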
This issue needs to be evaluated as part of a project’s feasibility study, and will be different for every project. While not wholly unique to AI projects, it represents a new problem that ordinary IT teams may not be ready for.
Sample Project Flow
What follows is a process outline for bringing a new AI product to market, which can also be used for internal projects by eliminating the commercial and promotional elements.
- Phase I: Project Candidate Identification
  - Set Internal Expectations
  - Steering Team Selection
  - Ideation
    - Including stakeholder input
  - Simple Feasibility Assessment
    - Feasibility
    - Data Availability
    - Market Requirement
    - Market Acceptance
    - Regulatory Hurdles
    - Profitability
    - Organizational Readiness
- Phase II: Project Candidate Evaluation
  - In-depth feasibility assessment of top candidates
  - Proof of concept
  - Decision Point: Project Go/No-Go
- Phase III: Planning
  - Technology Selection
  - Product Design
  - Go-to-Market Planning
  - Budgetary Adjustments
- Phase IV: Execution
  - Engineering
    - Data Marshaling/Preparation
    - Software Development
    - Model Training
  - Testing
    - Load Testing
    - Model Risk
    - User Acceptance Testing
  - Pilot/Beta
    - Engineering
  - Decision Point: Launch Readiness Assessment
- Phase V: Market Preparation
  - Marketing
  - Business Development
  - Sales
  - PR
  - Critical Point: Release
- Phase VI: Maintenance
  - Ongoing refinement/re-training
  - Bug fix releases
  - Document Lessons Learned
  - Decision Point: Project Future (one of)
    - Continued Development
    - Maintenance Mode
    - Decommission
    - Return to Phase III (similar to the OODA Loop)