How Artificial Intelligence <Currently> Works

1 | Introduction

This article provides a brief but thorough introduction to how AI <currently> works. Following tradition, the writer introduces AI by comparing it to a human. In particular, the writer draws comparisons with the cognitive structure, sensory capabilities, and learning techniques of humans.

2 | Artificial Neural Networks (ANNs)

The human brain consists of a network of neurons that processes information. Since the 1940s, AI scientists have been working on replicating the structure and functions of the human brain, with the intention of producing similar cognitive capabilities for AI [1]. This AI research subfield is known as ‘Artificial Neural Networks’ (ANNs). The term also refers to the technology itself.
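
To make the idea of a network of artificial neurons concrete, the following is a minimal sketch in Python. It assumes the common textbook formulation in which each neuron computes a weighted sum of its inputs and passes it through a sigmoid activation. The function names and the hand-picked weights are hypothetical and serve only as an illustration; a real ANN learns its weights from data.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of the inputs, then an activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def tiny_network(inputs):
    """Two inputs -> two hidden neurons -> one output neuron."""
    # Hypothetical hand-picked weights; a trained network would learn these.
    hidden = layer(inputs, [[0.5, -0.6], [0.3, 0.8]], [0.1, -0.2])
    return neuron(hidden, [1.2, -0.7], 0.05)


if __name__ == "__main__":
    print(tiny_network([1.0, 0.0]))
```

The output is always a number between 0 and 1, which can be read as the network's confidence in a yes/no answer.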

Initially, the development of ANNs was slow because computer processing power was limited. With the advent of the Fourth Industrial Revolution (4IR), the processing power of computers has grown considerably. Consequently, AI researchers have been able to develop advanced ANNs [2].

ANNs are currently capable of solving many business problems, such as customer behaviour prediction, data validation, risk management, and sales forecasting [3]. In fact, some AI researchers believe that they are close to developing ANNs that match human intelligence. As a result, investment in AI companies has increased. In July 2019, Microsoft invested one billion US dollars in OpenAI, a company co-founded by Elon Musk, which aims to match and surpass human cognitive capabilities [4].

AI research also investigates the development of memory-retention capabilities for AI. At present, advanced ANNs are capable of retaining short-term memories [5], which they can use for only one task at a time [6]. As basic as this may seem, ANNs are performing impressive tasks. For example, researchers at Google’s DeepMind trained an ANN with basic information about family relationships. The ANN stored the family relationship details in its memory, then was asked to identify a family member. It was presented with a question similar to ‘Who is Freya’s maternal great uncle?’ and was able to answer this question with ease [7].
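
The notion of a short memory can be illustrated with a single recurrent unit, the building block of the Recurrent Neural Networks mentioned in note [5]. The sketch below is a generic textbook cell and not the DeepMind system described above; the weights `w_x` and `w_h` are hypothetical values chosen for illustration.

```python
import math


def rnn_step(x, h, w_x=0.8, w_h=0.5, b=0.0):
    """One recurrent step: mix the new input x with the previous memory h."""
    return math.tanh(w_x * x + w_h * h + b)


def run_sequence(xs):
    """Feed a sequence through the cell and return the final memory state."""
    h = 0.0  # the memory starts empty
    for x in xs:
        h = rnn_step(x, h)
    return h


if __name__ == "__main__":
    # The same cell, with and without a signal at the start of the sequence.
    print(run_sequence([1.0, 0.0, 0.0]))
    print(run_sequence([0.0, 0.0, 0.0]))
```

The first sequence leaves a trace in the final state even though its last two inputs are zero, and that trace shrinks with every step, which is one way to picture why such memories are short.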

3 | The Sensory Capabilities of Artificial Intelligence

AI scientists use the blueprints of human sensory capabilities to create a model for AI sensory capabilities. Examples of relevant human sensory capabilities include hearing, vision, speech, and comprehension.

3.1 | Speech Recognition

First, humans can recognise that someone is speaking to them and identify who the speaker is by the timbre of their voice. Recognition starts with the act of speaking, then the corresponding acts of hearing, processing, and identifying. In AI research, this is the subfield of speech recognition, that is, the ability of a computer to recognise the particular nuances of a human’s voice [8].

Speech recognition works similarly to how a child identifies their parents and learns a language. The child initially hears words spoken by their parents and absorbs information about their timbre, then the child’s brain forms patterns and connections for future reference [9]. Similarly, speech recognition technology ‘hears’ by using microphones for real-time recognition or recordings of speech for later recognition. After hearing, the technology trains itself to identify and distinguish between different voices by forming patterns and connections.

An example of speech recognition technology is Google Assistant, which can be instructed to perform tasks by voice command. A user could instruct Google Assistant to unlock his/her mobile phone [10] or to make a restaurant reservation [11]. Another example is the voice-to-text transcription software, Otter AI. Otter AI can transcribe recordings of conversations (e.g. interviews, lectures, meetings) and identify and then separate each speaker according to the timbre of his/her voice [12]. However, there are dangers to speech recognition technology, particularly when it comes to cyber security. For example, voice-activated mobile phone security has become almost useless because AI can clone the voices of humans with an alarming degree of accuracy [13].

Speech recognition technology has a complex history of development due to the initial weakness of computer processing power and insufficient data. However, with advances in processing power and the increase in access to big data, it has become increasingly intelligent. The work of South African speech recognition experts speaks to the exciting developments in the technology. These experts have developed speech recognition models that can understand the nuances of South African English accents for transcription of speech [14].

The commercial relevance of this technology is the potential to create massive time and cost efficiencies. For example, advertisers may want to know whether the advertisements that they pay for are broadcast on radio in the time slot of their choice. The advertisements may be ‘live reads’, that is, scripted advertisements read ‘live’ on radio by presenters. Normally, humans would have to listen to hundreds of hours of radio broadcasts with an advertisement schedule in hand, then try to identify whether the advertisement was broadcast. This process is cumbersome and expensive. However, AI voice-to-text transcription software like Otter AI can be used to transcribe the live reads instantaneously, along with metadata such as date and time stamps. The final transcription can be used to verify that the paid-for advertising was actually broadcast, saving both time and cost.
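
The verification step can be sketched in a few lines of Python. The example assumes that the transcript is already available as a list of (timestamp, text) pairs; this layout, the advertiser name, and the function names are hypothetical and do not reflect any particular product's export format. Fuzzy matching with the standard library's `difflib.SequenceMatcher` allows for small transcription differences.

```python
from datetime import datetime, time
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def find_live_read(segments, script, slot_start, slot_end, threshold=0.8):
    """Return the timestamps of segments that resemble the script
    and were spoken inside the paid-for time slot."""
    hits = []
    for stamp, text in segments:
        similarity = SequenceMatcher(
            None, normalise(script), normalise(text)
        ).ratio()
        in_slot = slot_start <= stamp.time() <= slot_end
        if similarity >= threshold and in_slot:
            hits.append(stamp)
    return hits


# Hypothetical script and transcript, for illustration only.
SCRIPT = "Visit Acme Coffee on Main Street this weekend for half price lattes"
segments = [
    (datetime(2019, 9, 2, 7, 15),
     "visit acme coffee on main street this weekend for half-price lattes"),
    (datetime(2019, 9, 2, 7, 20),
     "and now the traffic report for the morning commute"),
]

if __name__ == "__main__":
    print(find_live_read(segments, SCRIPT, time(7, 0), time(8, 0)))
```

An empty result for a given slot would indicate that the live read was not found there and should be queried with the broadcaster.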

3.2 | Natural Language Processing (NLP)

Secondly, computers excel at learning from number-filled spreadsheets of data; however, humans generally communicate with words, not numbers [15]. Computers need to understand and communicate in human language to interact with humans. This subfield of AI research is called ‘Natural Language Processing’ (NLP). NLP pursues the goal of understanding and communicating with humans by programming AI to understand and apply rules of syntax (grammar), semantics (the meaning of words), and pragmatics (context and subtext) [16]. An example of an NLP technology is Grammarly, an AI-powered writing assistant that automatically detects grammar, spelling, punctuation, word choice, and style mistakes [17]. Grammarly’s algorithms flag issues in the written text and suggest corrections based on context and a range of writing styles [18].
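
As a rough illustration of what applying a rule of syntax can look like in code, the sketch below flags two simple issues: a repeated word, and ‘a’ before a word that starts with a vowel letter. It is a deliberately crude, hand-written toy and is not how Grammarly or any other product works; the function name `check_text` is hypothetical.

```python
import re


def check_text(text):
    """Flag doubled words and 'a' used before a vowel letter."""
    issues = []
    words = re.findall(r"[A-Za-z']+", text)
    # Look at each pair of neighbouring words.
    for prev, cur in zip(words, words[1:]):
        if prev.lower() == cur.lower():
            issues.append(f"repeated word: '{cur}'")
        # Crude rule: it checks spelling, not sound, so it would
        # wrongly flag phrases such as 'a university'.
        if prev.lower() == "a" and cur[0].lower() in "aeiou":
            issues.append(f"use 'an' before '{cur}'")
    return issues


if __name__ == "__main__":
    print(check_text("She ate a apple and and left."))
```

The limitation noted in the comment shows why syntax rules alone are not enough and why semantics and pragmatics also matter.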

3.3 | Computer Vision

Finally, humans can generally see with their eyes and process what they see. ‘Computer Vision’ is the subfield of AI research that focuses on enabling computers to interpret and understand the visual world [19]. The visual world consists of image data such as images, videos (a series of image frames), and three-dimensional objects [20]. This data can be captured by cameras and processed in real time, or uploaded into the computer vision system and processed after the fact. Thus, ‘[a]t an abstract level, the goal of computer vision problems is to use the observed image data to infer something about the world’ [21].

An example of Computer Vision technology is Facebook’s facial recognition software. This software can analyse the details of human faces in uploaded photographs, such as the distances between the eyes, nose, and other facial features [22]. In Patel v Facebook, a recent case dealing with whether Facebook users can sue Facebook for consent issues related to facial recognition software, the United States Court of Appeals for the Ninth Circuit stated that ‘[o]nce a face template of an individual is created, Facebook can use it to identify that individual in any of the other hundreds of millions of photos uploaded to Facebook each day, as well as determine when the individual was present at a specific location’ [23]. It stands to reason that this technology is incredibly powerful.
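
The idea of a face template can be sketched as a short list of measurements that is compared against stored templates. The example below is an assumption-laden toy and not Facebook's method: the three measurements per face, the user names, and the threshold are all hypothetical, and the comparison is a plain Euclidean distance.

```python
import math


def distance(a, b):
    """Euclidean distance between two equally long lists of measurements."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(face, templates, threshold=0.1):
    """Return the name of the closest stored template,
    or None if no template is close enough."""
    best_name, best_dist = None, float("inf")
    for name, template in templates.items():
        d = distance(face, template)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= threshold else None


# Hypothetical templates: e.g. eye-to-eye, eye-to-nose and nose-to-mouth
# distances, each scaled relative to the width of the face.
templates = {
    "user_a": [0.42, 0.31, 0.55],
    "user_b": [0.38, 0.36, 0.49],
}

if __name__ == "__main__":
    print(identify([0.41, 0.32, 0.54], templates))
    print(identify([0.90, 0.90, 0.90], templates))
```

The threshold controls the trade-off between missing a genuine match and wrongly matching a stranger.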

4 | Humanoid Artificial Intelligence

Computer scientists have created AI that looks like humans by using robot technology. These AI-powered robots are known as humanoid robots [24]. By way of illustration, Sophia is an AI humanoid robot that was granted citizenship in Saudi Arabia in 2017 [25]. The inventors of Sophia argue that AI, like Sophia, needs to develop a social relationship with humans to assimilate human qualities such as compassion and ethics [26]. In the writer’s view, the inventors’ reasoning is particularly important when AI like Vital or Alicia T enters corporate leadership, since it would need to assimilate corporate governance values such as ethical leadership.

AI humanoid robots are not, however, limited to corporate leadership roles. The demand for AI humanoid robots is high across starkly different industries. On the one hand, various nations are developing AI humanoids for their armies [27]. On the other hand, the adult industry has created AI humanoid sex robots for companionship [28]. Consequently, AI is penetrating almost every industry.

In the next section, the writer considers how AI learns.

5 | How Artificial Intelligence Learns

As mentioned elsewhere, the development of AI is centred on mimicking human behaviour. Since humans use learning techniques to achieve intelligence, AI must also have its own learning techniques.

5.1 | Machine Learning (ML)

Machine Learning (ML) is the main learning technique of AI. ML achieves intelligence through algorithms trained with large data sets. With ML, AI can learn from patterns in data, then make predictions based on what it has learned [29].
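
A minimal sketch of learning from patterns in data and then predicting is a straight-line fit by ordinary least squares. The figures below are hypothetical (advertising spend against sales) and the function names are illustrative; real ML systems use far larger data sets and more flexible models.

```python
def fit_line(xs, ys):
    """Learn a slope and intercept from example (x, y) pairs
    using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned line to an input the model has not seen."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data, for illustration only.
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

if __name__ == "__main__":
    model = fit_line(spend, sales)
    print(predict(model, 5.0))
```

The training step (`fit_line`) and the prediction step (`predict`) are kept separate, which mirrors the two phases described above.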

ML forms part of the everyday lives of humans. The most basic example is Google Search. When a user searches for a given topic via the search engine, Google returns the results that are most relevant to the user based on his/her search history. In this example, Google uses the user’s search history as the data on which to train the ML algorithms. The search history data may, for example, indicate that the user is a coffee lover, so when he/she searches for the word ‘Java’, websites relevant to coffee will appear first. In contrast, if the search history indicates that the user is a computer programmer, when the programmer searches for ‘Java’, websites related to the programming language Java will appear first in the results [30].
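
The ‘Java’ example can be mimicked with a toy re-ranking heuristic. This is not how Google Search works and it is a simple word count, not a trained model; the result records, their `keywords` field, and the search histories are all hypothetical. It only shows how the same query can be ordered differently for different users.

```python
from collections import Counter


def rank_results(results, search_history):
    """Order results so that those sharing words with the user's
    past searches come first."""
    # Count how often each word appears in the user's earlier queries.
    interests = Counter(
        word for query in search_history for word in query.lower().split()
    )

    def score(result):
        # A Counter returns 0 for words the user never searched for.
        return sum(interests[word] for word in result["keywords"])

    return sorted(results, key=score, reverse=True)


# Hypothetical results for the query 'Java'.
results = [
    {"title": "Java programming tutorial", "keywords": ["programming", "code"]},
    {"title": "Java coffee beans", "keywords": ["coffee", "espresso"]},
]
coffee_lover = ["best espresso machine", "coffee shops near me"]
programmer = ["python code examples", "learn programming"]

if __name__ == "__main__":
    print(rank_results(results, coffee_lover)[0]["title"])
    print(rank_results(results, programmer)[0]["title"])
```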

5.2 | Deep Learning (DL)

‘Deep Learning’ (DL) is a subset of ML inspired by the structure of the human brain. Consequently, ANNs are relevant to DL. As ANNs become more sophisticated, they are trained to solve more challenging problems. Deep learning techniques specifically aim to replicate the learning development pathways of humans, with a focus on visual or symbolic perception instead of datasets [31]. A simple example of the application of DL is an image classifier such as Facebook’s facial recognition technology. Facebook integrates DL into its algorithms with the images of users as the visual source. The DL system recognises patterns in the facial features of a particular user. Once the system has recognised the face, it can automatically detect that user in a picture uploaded by any user anywhere in the world.

6 | Conclusion

Understanding how AI works is challenging because of how interdisciplinary the field of AI research is. To grasp AI as a whole, the writer suggests breaking the field down into its component subfields, then learning how each one works individually. After that, the reader should consider AI as a whole.

7 | References

[1] Dacombe J ‘An introduction to Artificial Neural Networks (with example)’ Medium 23 October 2017 available at https://medium.com/@jamesdacombe/an-introduction-to-artificial-neural-networks-with-example-ad459bb6941b (accessed 2 September 2019).

[2] Dacombe J ‘An introduction to Artificial Neural Networks (with example)’ Medium 23 October 2017 available at https://medium.com/@jamesdacombe/an-introduction-to-artificial-neural-networks-with-example-ad459bb6941b (accessed 2 September 2019).

[3] Shah J ‘Neural Networks for Beginners: Popular Types and Applications’ Medium 16 November 2017 available at https://blog.statsbot.co/neural-networks-for-beginners-d99f2235efca (accessed 12 December 2018).

[4] Cuthbertson A ‘Elon Musk’s AI Project to Replicate the Human Brain Receives $1 Billion from Microsoft’ Independent 23 July 2019 available at https://www.independent.co.uk/life-style/gadgets-and-tech/news/elon-musk-ai-openai-microsoft-artificial-intelligence-funding-a9016736.html?fbclid=IwAR34gco7ouUfXbhw2fd2rLzTx7_gtKjiIWwoM9Al1V4PZdzDcEmCZvfrJ78 (accessed 5 August 2019).

[5] ANNs capable of memory are referred to as ‘Recurrent Neural Networks’ (RNNs). For a brief introduction to RNNs, see generally Banerjee S ‘An Introduction to Recurrent Neural Networks’ Medium 23 May 2018 available at https://medium.com/explore-artificial-intelligence/an-introduction-to-recurrent-neural-networks-72c97bf0912 (accessed 1 August 2019).

[6] Engelking C ‘An Artificial Neural Network Forms Its Own Memories’ Discover Magazine 13 October 2016 available at http://blogs.discovermagazine.com/d-brief/2016/10/13/artificial-neural-network-memories/#.XXpD55MzbaY (accessed 12 January 2019).

[7] Engelking C ‘An Artificial Neural Network Forms Its Own Memories’ Discover Magazine 13 October 2016 available at http://blogs.discovermagazine.com/d-brief/2016/10/13/artificial-neural-network-memories/#.XXpD55MzbaY (accessed 12 January 2019).

[8] Lynn J ‘Can We Talk: Speech Recognition Technology is Changing Your Relationship with Your Computer’ (1999) 14 Commercial Law Bulletin 14–7.

[9] Globalme ‘Speech Recognition Technology Overview’ available at https://www.globalme.net/blog/the-present-future-of-speech-recognition (accessed 10 September 2019).

[10] ‘The Google Assistant is a virtual assistant powered by artificial intelligence and developed by Google that is primarily available on mobile and smart home devices. Unlike Google Now, the Google Assistant can engage in two-way conversations.’ Oloo V ‘How to unlock your phone with your voice using Google Assistant’ Dignited 8 September 2018 available at https://www.dignited.com/34613/unlock-smartphone-voice-google-assistant/ (accessed 7 June 2019).

[11] Lumb D ‘Google Assistant Can Now Make Vocal Restaurant Reservations in 43 States’ Techradar 6 March 2019 available at https://www.techradar.com/news/google-assistant-can-now-make-restaurant-reservations-in-43-states (accessed 10 September 2019).

[12] Su J ‘CEO Tech Talk: How Otter.ai Uses Artificial Intelligence To Automatically Transcribe Speech To Text’ Forbes 18 June 2019 available at https://www.forbes.com/sites/jeanbaptiste/2019/06/19/ceo-tech-talk-how-otter-ai-uses-artificial-intelligence-to-automatically-transcribe-speech-to-text/#76d8d1c38729 (accessed 31 July 2019).

[13] Cole S ‘Deep Voice Software Can Clone Anyone’s Voice With Just 3.7 Seconds of Audio’ Vice 7 March 2018 available at https://www.vice.com/en_us/article/3k7mgn/baidu-deep-voice-software-can-clone-anyones-voice-with-just-37-seconds-of-audio (accessed 12 May 2019).

[14] Kamper H & Niesler TR ‘The Impact of Accent Identification Errors on Speech Recognition of South African English’ 110 South African Journal of Science 63–9.

[15] Raval S ‘Natural Language Processing’ YouTube 26 March 2019 available at https://www.youtube.com/watch?v=bDxFvr1gpSU (accessed 3 September 2019).

[16] Gour R ‘What is Natural Language Processing in Artificial Intelligence?’ Medium 19 March 2019 available at https://medium.com/@rinu.gour123/what-is-natural-language-processing-in-artificial-intelligence-b13dc4aa1c81 (accessed 3 September 2019).

[17] Marr B ‘The Amazing Ways Google And Grammarly Use Artificial Intelligence To Improve Your Writing’ Forbes 12 November 2018 available at https://www.forbes.com/sites/bernardmarr/2018/11/12/the-amazing-ways-google-and-grammarly-use-artificial-intelligence-to-improve-our-writing/#46c98ded3bb0 (accessed 3 September 2019).

[18] Hill S ‘How Grammarly & Google are Using Artificial Intelligence for Flawless Writing’ Big Data Made Simple 13 December 2018 available at https://bigdata-madesimple.com/how-grammarly-google-are-using-artificial-intelligence-for-flawless-writing/ (accessed 3 September 2019).

[19] Raval S ‘Learn Computer Vision’ YouTube 14 July 2019 available at https://www.youtube.com/watch?v=FSe_02FpJas (accessed 3 September 2019).

[20] Raval S ‘Learn Computer Vision’ YouTube 14 July 2019 available at https://www.youtube.com/watch?v=FSe_02FpJas (accessed 3 September 2019).

[21] Brownlee J ‘A Gentle Introduction to Computer Vision’ Machine Learning Mastery 5 July 2019 available at https://machinelearningmastery.com/what-is-computer-vision/ (accessed 5 September 2019).

[22] Ingber S ‘Users Can Sue Facebook Over Facial Recognition Software, Court Rules’ NPR 8 August 2019 available at https://www.npr.org/2019/08/08/749474600/users-can-sue-facebook-over-facial-recognition-software-court-rules (accessed 4 September 2019).

[23] Patel v Facebook Inc No. 18-15982 (9th Cir. 2019) p 17.

[24] Hunt V Smart Robots: A Handbook of Intelligent Robotic Systems 1 ed (1985) Springer, USA 1–30.

[25] Stone Z ‘Everything You Need to Know about Sophia, The World’s First Robot Citizen’ Forbes 7 November 2017 available at https://www.forbes.com/sites/zarastone/2017/11/07/everything-you-need-to-know-about-sophia-the-worlds-first-robot-citizen/ (accessed 2 January 2019).

[26] Hanson DF ‘Why Sofia Was Made’ YouTube 4 February 2018 available at https://www.youtube.com/watch?v=h4-_2b9zPiA (accessed 9 May 2019).

[27] Cole S ‘U.S. Army preps for future of AI on the battlefield’ Military Embedded Systems available at http://mil-embedded.com/articles/u-s-army-preps-for-future-of-ai-on-the-battlefield/ (accessed 12 September 2019).

[28] Morris A ‘Prediction: Sex Robots Are The Most Disruptive Technology We Didn’t See Coming’ Forbes 25 September 2018 available at https://www.forbes.com/sites/andreamorris/2018/09/25/prediction-sex-robots-are-the-most-disruptive-technology-we-didnt-see-coming/#21a943cd6a56 (accessed 12 May 2019).

[29] Surden H ‘Machine Learning and Law’ (2014) 89 Washington Law Review 89–93.

[30] Google Cloud Platform ‘What is Machine Learning? (AI Adventures)’ YouTube 24 August 2017 available at https://www.youtube.com/watch?v=HcqpanDadyQ (accessed 17 May 2018).

[31] Karanasiou AP & Pinotsis DA ‘A Study into the Layers of Automated Decision-Making: Emergent Normative and Legal Aspects of Deep Learning’ (2017) 31(2) International Review of Law, Computers & Technology 174.
