Advanced Neural Network Techniques - Elements of AI

The document discusses advanced neural network techniques, focusing on convolutional neural networks (CNNs) and generative adversarial networks (GANs). CNNs are highlighted for their ability to efficiently detect image features while reducing the amount of training data needed, and GANs are introduced as a method for generating realistic images through the competition of two neural networks. Additionally, it touches on the rise of large language models (LLMs) and their applications, particularly in generating human-like text responses.

Uploaded by

KenKen

3/21/25, 11:13 AM Advanced neural network techniques - Elements of AI

Elements of AI

III. Advanced neural network techniques

In the previous section, we discussed the basic ideas behind most neural network methods:
multilayer networks, non-linear activation functions, and learning rules such as the
backpropagation algorithm.

They power almost all modern neural network applications. However, there are some
interesting and powerful variations of the theme that have led to great advances in deep
learning in many areas.

Convolutional neural networks (CNNs)

One area where deep learning has achieved spectacular success is image processing. The
simple classifier that we studied in detail in the previous section is severely limited – as you
noticed, it wasn’t even possible to classify all the smiley faces correctly. Adding more layers
in the network and using backpropagation to learn the weights does in principle solve the
problem, but another one emerges: the number of weights becomes extremely large and
consequently, the amount of training data required to achieve satisfactory accuracy can
become too large to be realistic.

Fortunately, a very elegant solution to the problem of too many weights exists: a special kind
of neural network, or rather, a special kind of layer that can be included in a deep neural
network. This special kind of layer is a so-called convolutional layer. Networks including
convolutional layers are called convolutional neural networks (CNNs). Their key property
is that they can detect image features such as bright or dark (or specific color) spots, edges in
various orientations, patterns, and so on. These form the basis for detecting more abstract
features such as a cat’s ears, a dog’s snout, a person’s eye, or the octagonal shape of a stop
sign. It would normally be hard to train a neural network to detect such features based on
the pixels of the input image, because the features can appear in different positions,
different orientations, and in different sizes in the image: moving the object or the camera
angle will change the pixel values dramatically even if the object itself looks just the same to
us. Learning to detect a stop sign in all these different conditions would require vast
amounts of training data because the network would only detect the sign in conditions
where it has appeared in the training data. So, for example, a stop sign in the top right corner
of the image would be detected only if the training data included an image with the stop sign
in the top right corner. CNNs can recognize the object anywhere in the image no matter
where it has been observed in the training images.
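
To make the "too many weights" problem concrete, the following sketch counts the weights in one fully connected layer and in one convolutional layer. The image size, the number of neurons, and the number and size of the filters are illustrative assumptions, not figures from the course.

```python
# Illustrative sizes only: none of these numbers come from the course text.
height, width, channels = 256, 256, 3   # a small colour image
hidden_neurons = 1000                   # neurons in a fully connected layer

# Fully connected: every neuron has its own weight for every input value, plus a bias.
dense_weights = height * width * channels * hidden_neurons + hidden_neurons

# Convolutional: each filter has one small set of weights that is reused at
# every image position, plus a bias.
filters, kernel_size = 64, 3
conv_weights = filters * (kernel_size * kernel_size * channels) + filters

print(f"fully connected layer: {dense_weights:,} weights")
print(f"convolutional layer:   {conv_weights:,} weights")
```

Under these assumptions the fully connected layer needs roughly 197 million weights, while the convolutional layer needs fewer than two thousand.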

Note

Why we need CNNs

CNNs use a clever trick to reduce the amount of training data required to detect objects in different
conditions. The trick basically amounts to using the same input weights for many neurons – so that all of
these neurons are activated by the same pattern – but with different input pixels. We can for example have a
set of neurons that are activated by a cat’s pointy ear. When the input is a photo of a cat, two neurons are
activated, one for the left ear and another for the right. We can also let the neuron’s input pixels be taken
from a smaller or a larger area, so that different neurons are activated by the ear appearing in different
scales (sizes), so that we can detect a small cat’s ears even if the training data only included images of big
cats.
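
The weight-sharing trick can be sketched in a few lines of NumPy. The 2x2 "ear" pattern and the tiny 6x10 "photo" below are made-up stand-ins, and the sketch only covers different positions, not different scales. Every 2x2 window of the image gets its own neuron, but all of these neurons use the same four weights.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# A hypothetical 2x2 "pointy ear" pattern; the same weights serve every neuron.
ear = np.array([[0.0, 1.0],
                [1.0, 1.0]])

# A blank 6x10 "photo" with the pattern pasted in at two different places.
image = np.zeros((6, 10))
image[1:3, 1:3] = ear   # left ear
image[3:5, 6:8] = ear   # right ear

# One neuron per 2x2 window: each sees different pixels but uses the same weights.
windows = sliding_window_view(image, ear.shape)        # shape (5, 9, 2, 2)
activations = np.einsum("ijkl,kl->ij", windows, ear)   # shape (5, 9)

# The neurons whose window matches the pattern exactly respond most strongly.
strongest = np.argwhere(activations == activations.max())
print(strongest.tolist())   # [[1, 1], [3, 6]]
```

Two neurons respond most strongly, one for each ear, even though only a single set of four weights exists.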

The convolutional neurons are typically placed in the bottom layers of the network, which
process the raw input pixels. Basic neurons (like the perceptron neuron discussed above)
are placed in the higher layers, which process the output of the bottom layers. The bottom
layers can usually be trained using unsupervised learning, without a particular prediction
task in mind. Their weights will be tuned to detect features that appear frequently in the
input data. Thus, with photos of animals, typical features will be ears and snouts, whereas in
images of buildings, the features are architectural components such as walls, roofs,
windows, and so on. If a mix of various objects and scenes is used as the input data, then the
features learned by the bottom layers will be more or less generic. This means that pre-
trained convolutional layers can be reused in many different image processing tasks. This is
extremely important since it is easy to get virtually unlimited amounts of unlabeled training
data – images without labels – which can be used to train the bottom layers. The top layers
are then trained by supervised machine learning, typically using backpropagation.
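
The division of labour between reusable bottom layers and a task-specific top layer can be illustrated with a toy example. Everything here is an assumption made for illustration: the two filters are written by hand as stand-ins for pre-trained convolutional filters, the "images" are 8x8 grids containing one line, and the top layer is a single sigmoid neuron. Only the top layer is trained, and only it sees the labels.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)

# Stand-ins for pre-trained bottom-layer filters (hand-written here, not learned).
vertical = np.array([[-1.0, 2.0, -1.0]] * 3)
filters = [vertical, vertical.T]   # vertical-line and horizontal-line detectors

def bottom_layers(image):
    """Frozen feature extractor: strongest response of each filter anywhere in the image."""
    windows = sliding_window_view(image, (3, 3))
    return np.array([np.einsum("ijkl,kl->ij", windows, f).max() for f in filters])

def make_image(label):
    """8x8 image with one line at a random interior position (1 = vertical, 0 = horizontal)."""
    image = np.zeros((8, 8))
    position = rng.integers(1, 7)
    if label == 1:
        image[:, position] = 1.0
    else:
        image[position, :] = 1.0
    return image

labels = np.array([0, 1] * 20)
features = np.array([bottom_layers(make_image(y)) for y in labels])

# Top layer: a single sigmoid neuron trained with supervised gradient descent.
weights, bias = np.zeros(2), 0.0
for _ in range(100):
    predictions = 1.0 / (1.0 + np.exp(-(features @ weights + bias)))
    error = predictions - labels
    weights -= 0.1 * features.T @ error / len(labels)
    bias -= 0.1 * error.mean()

predictions = 1.0 / (1.0 + np.exp(-(features @ weights + bias)))
accuracy = ((predictions > 0.5) == (labels == 1)).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The same frozen `bottom_layers` function could be reused for a different task on similar images by training a new top layer on new labels.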

Do neural networks dream of electric sheep? Generative adversarial networks (GANs)

Having trained a neural network on data, we can use it for predictions. Since the top layers of
the network have been trained in a supervised manner to perform a particular classification
or prediction task, the top layers are really useful only for that task. A network trained to
detect stop signs is useless for detecting handwritten digits or cats.

A fascinating result is obtained by taking the pre-trained bottom layers and studying what
the features they have learned look like. This can be achieved by generating images that
activate a certain set of neurons in the bottom layers. Looking at the generated images, we
can see what the neural network “thinks” a particular feature looks like, or what an image
with a select set of features in it would look like. Some even like to talk about the networks
“dreaming” or “hallucinating” images (see Google’s DeepDream system).

Note

Be careful with metaphors

However, we’d like to once again emphasize the problem with metaphors such as dreaming when simple
optimization of the input image is meant – remember the suitcase words discussed in Chapter 1. The neural
network doesn’t really dream, and it doesn’t have a concept of a cat that it would understand in the way
a human does. It is simply trained to recognize objects and it can generate images that are
similar to the input data that it is trained on.
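
What "optimization of the input image" means can be shown with a deliberately tiny sketch. It assumes a single linear neuron with hand-picked weights (a vertical-edge detector on a 5x5 patch), which is far simpler than the networks used in systems such as DeepDream. The weights stay fixed, and the pixels are adjusted step by step so that the neuron's activation increases.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights of one neuron looking at a 5x5 patch: a vertical-edge detector.
weights = np.zeros((5, 5))
weights[:, :2] = -1.0
weights[:, 3:] = 1.0

def activation(image):
    return float(np.sum(weights * image))

# Start from random noise and repeatedly nudge the pixels in the direction
# that increases the activation (gradient ascent on the input, not on the weights).
image = rng.normal(size=(5, 5))
image /= np.linalg.norm(image)
for _ in range(100):
    gradient = weights                    # d(activation)/d(image) for a linear neuron
    image = image + 0.5 * gradient
    image /= np.linalg.norm(image)        # keep pixel values bounded

similarity = np.sum(image * weights) / np.linalg.norm(weights)
print(f"activation: {activation(image):.2f}, similarity to the neuron's pattern: {similarity:.3f}")
```

The noise turns into the edge pattern the neuron responds to, which is all that the "dream" amounts to in this toy setting.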

To actually generate real-looking cats, human faces, or other objects (you’ll get whatever you
used as the training data), Ian Goodfellow, then a PhD student at the Université de Montréal, proposed
a clever combination of two neural networks. The idea is to let the two networks compete
against each other. One of the networks is trained to generate images like the ones in the
training data – it is called the generative network. The other network’s task is to separate
images generated by the first network from real images from the training data – this one is
called the adversarial network. These two combined then make up a generative adversarial
network or a GAN.

The system trains the two models side by side. At the beginning of the training, the
adversarial model can easily tell apart the real images from the training data and
the clumsy attempts by the generative model. However, as the generative network slowly
gets better and better, the adversarial model has to improve as well, and the cycle continues
until eventually the generated images are almost indistinguishable from real ones. The GAN
does not merely try to reproduce the images in the training data: that would be far too simple
a strategy to beat the adversarial network. Rather, the system is trained so that it has to be able
to generate new, real-looking images too.
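
The alternating training of the two networks can be sketched with a one-dimensional toy problem. This is a minimal sketch under strong assumptions: the "images" are single numbers drawn around 4.0, the generative network is a linear function of random noise, the adversarial network (often called the discriminator) is a single sigmoid neuron, and the gradients are written out by hand. Real GANs use deep networks, and training this simple can oscillate rather than settle.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# "Real" data: numbers drawn around 4.0 (a stand-in for real images).
def real_batch(n):
    return rng.normal(loc=4.0, scale=1.0, size=n)

# Generative network: turns random noise z into a sample, x = a * z + b.
a, b = 1.0, 0.0
# Adversarial network (discriminator): D(x) = sigmoid(w * x + c), the
# estimated probability that x is real.
w, c = 0.0, 0.0

learning_rate, batch = 0.05, 64
for step in range(500):
    # 1) Update the adversarial network: push D(real) toward 1 and D(fake) toward 0.
    z = rng.normal(size=batch)
    real, fake = real_batch(batch), a * z + b
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    grad_w = np.mean((d_real - 1.0) * real) + np.mean(d_fake * fake)
    grad_c = np.mean(d_real - 1.0) + np.mean(d_fake)
    w -= learning_rate * grad_w
    c -= learning_rate * grad_c

    # 2) Update the generative network: push D(fake) toward 1, i.e. fool the adversary.
    z = rng.normal(size=batch)
    fake = a * z + b
    d_fake = sigmoid(w * fake + c)
    grad_a = np.mean((d_fake - 1.0) * w * z)
    grad_b = np.mean((d_fake - 1.0) * w)
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b

samples = a * rng.normal(size=1000) + b
print(f"mean of generated samples: {samples.mean():.2f} (real data mean: 4.00)")
```

The generative network never sees the real data directly. Its only training signal is how well its samples fool the adversarial network.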

The above images were generated by a GAN developed by NVIDIA in a project led by Prof
Jaakko Lehtinen (see this article for more).

Could you have recognized them as fakes?

The Rise of Large Language Models (LLMs)

As mentioned above, convolutional neural networks (CNNs) reduce the number of learnable
weights in a neural network so that the amount of training data required to learn all of them
doesn't grow astronomically large as we keep building bigger and bigger networks. Another
architectural innovation, besides the idea of a CNN, that currently powers many state-of-the-art
deep learning models is called attention.

Attention mechanisms were originally introduced for machine translation, where they can
selectively focus the attention of the model on certain words in the input text when
generating a particular word in the output. This way the model doesn't have to pay attention
to all of the input at the same time, which greatly simplifies the learning task. Attention
mechanisms were soon found to be extremely useful well beyond machine translation.
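
One common form of attention, scaled dot-product attention, can be sketched as follows. The vectors here are random stand-ins for learned word representations, and the sizes (four input words, two output words, eight numbers per word) are arbitrary assumptions. For each output word, the function computes a set of weights over the input words that sums to one, and then returns the correspondingly weighted mix of the input values.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)    # how well each query matches each key
    weights = softmax(scores, axis=-1)        # one row of attention weights per query
    return weights @ values, weights

rng = np.random.default_rng(0)
keys = rng.normal(size=(4, 8))      # one vector per input word
values = rng.normal(size=(4, 8))
queries = rng.normal(size=(2, 8))   # one vector per output word being generated

output, weights = attention(queries, keys, values)
print(weights.round(2))             # each row sums to 1
```

A large weight in row i, column j means that output word i draws mostly on input word j.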

In 2017, a team working at Google published the blockbuster article "Attention is All You
Need", which introduced the so-called transformer architecture for deep neural networks.
Unless you have been living on a desert island or on an otherwise strict media diet, you have
most likely already heard about transformers (the neural network models, not the toy
franchise). It's just that they may have been hiding inside an acronym: GPT (Generative
Pretrained Transformer). As the title of the article by the Google team suggests, transformers
heavily exploit attention mechanisms to get the most out of the available training data and
computational resources.

The most widely noted applications of transformers are found in large language models
(LLMs). The best known ones are OpenAI's GPT-series, including GPT-1 released in June
2018 and GPT-4 announced in March 2023, but no giant platform company wants to miss
out: Google picked a model name from Sesame Street and published BERT (Bidirectional
Encoder Representations from Transformers) in October 2018, while Meta joined the party a
bit later in February 2023, picking a name inspired by the animal world, LLaMA (Large
Language Model Meta AI). And it's not just the platform companies that are driving the
development: universities and other research organizations are contributing open source
models with the goal of democratizing the technology.

Note

What's in an LLM?

LLMs are models that, given a piece of text like "The capital of Finland is", predict how the text is likely to
continue. In this case, "Helsinki" or "a pocket-sized metropolis" would be likely continuations. LLMs are
trained on large amounts of text such as the entire contents of Wikipedia or the CommonCrawl dataset
that, at the time of writing this, contains a whopping 260 billion web pages.
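
The idea of predicting a continuation can be illustrated with a toy model that simply counts which word follows each pair of words in a tiny, made-up training text. This is only an illustration of the prediction task: actual LLMs use neural networks (transformers) trained on vastly more text, not count tables, and the function names below are hypothetical.

```python
from collections import Counter, defaultdict

# A tiny made-up training text (lower-cased, with the full stop as its own token).
corpus = (
    "the capital of finland is helsinki . "
    "the capital of finland is helsinki . "
    "the capital of finland is a pocket-sized metropolis ."
).split()

def train(tokens, context_size=2):
    """Count which word follows each run of `context_size` words."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - context_size):
        context = tuple(tokens[i:i + context_size])
        counts[context][tokens[i + context_size]] += 1
    return counts

def predict(counts, text, context_size=2):
    """Return the possible next words with their estimated probabilities, most likely first."""
    context = tuple(text.lower().split()[-context_size:])
    options = counts.get(context, Counter())
    total = sum(options.values())
    return {word: count / total for word, count in options.most_common()}

counts = train(corpus)
result = predict(counts, "The capital of Finland is")
print(result)
```

With this training text, "helsinki" comes out as the most likely continuation with probability 2/3, and "a" (the start of "a pocket-sized metropolis") gets the remaining 1/3.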

In principle, one can view LLMs as basically nothing but extremely powerful predictive text
entry techniques. However, with some further thinking, it becomes apparent that being able
to predict the continuation of any text in a way that is indistinguishable from human
writing is (or would be) quite a feat and encompasses many aspects of intelligence. The
above example, which is based on the association between the words "the capital of Finland"
and "Helsinki", is an example where the model has learned a fact about the world. If we were
able to build models that associate the commonly agreed answers with a wide range of
questions, it could be argued that such a model has learned a big chunk of so-called "world
knowledge". Especially intriguing are instances where the model seems to exhibit some level
of reasoning beyond memorization and statistical co-occurrence: currently, LLMs are able to
do this in a limited sense and they can easily make trivial mistakes because they are based
on "just" statistical machine learning. Intensive research and development efforts are
directed at building deep learning models with more robust reasoning algorithms and
databases of verified facts.

Note

ChatGPT: AI for the masses

A massive earthquake occurred in San Francisco on November 30, 2022. It was so powerful that hardly a
person on the planet was unaffected, and yet, no seismometer detected it. This metaphorical "earthquake"
was the launch of ChatGPT by OpenAI. Word of the online chatbot service that anyone could use free of
charge quickly spread around the world and after a mere five days, it had more than a million registered users
(compare this to the five years that it took the Elements of AI to reach the same number), and in two
months, the number of signups was 100 million. No other AI service, or probably any service whatsoever,
has become a household name so quickly.

The first version of ChatGPT was based on a GPT-3.5 model fine-tuned by supervised and reinforcement
learning according to a large number of human-rated responses. The purpose of the fine-tuning process was
to steer the model away from toxic and incorrect responses that the language model had picked up from its
training data, and towards comprehensive and helpful responses.

It is not easy to say what caused the massive media frenzy and the unprecedented interest
in ChatGPT by pretty much everyone, even those who hadn't paid much attention to
AI thus far. Probably some of it is explained by the somewhat better quality of the output,
due to the fine-tuning, and the easy-to-use chat interface, which enables the user to not only
get one-off answers to isolated questions, like any of the earlier LLMs, but also maintain a
coherent dialogue in a specific context. In the same vein, the chat interface allows one to
make requests like "explain this to a five-year-old" or "write that as a song in the style of Nick
Cave" (Mr Cave, however, wasn't impressed [BBC]). In any case, ChatGPT succeeded in
bumping the interest in AI to completely new levels.

It remains to be seen what the real "killer apps" for ChatGPT and other LLM-based
solutions will be. We believe the most likely candidates are ones where the factual content comes
from the user or from another system, and the language model is used to format the output
in the form of language (either natural language or possibly formal language such as
program code). We'll return to the expected impact of ChatGPT and other LLM-based
applications in the final chapter.

After completing Chapter 5 you should be able to:

Explain what a neural network is and where they are being successfully used

Understand the technical methods that underpin neural networks

Please join the Elements of AI community to discuss and ask questions about this chapter.
