[Deep Learning, Beginner] A Single Neuron

by happynaraepapa 2025. 2. 3. 16:09

sources :
https://www.kaggle.com/code/ryanholbrook/a-single-neuron

...
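Welcome to Kaggle's Introduction to Deep Learning course! You're about to learn all you need to get started building your own deep neural networks, using Keras and TensorFlow.
#We will learn the basics of deep learning.
#The Python libraries used are Keras and TensorFlow.
#Unfortunately, I suspect most learners do not fully understand the mathematics and algorithms behind deep learning or machine learning; they mainly learn how to use the tools and apply them. Even workbooks and certification exams push people to memorize code, which I find a real pity, because in a few years all of that will be useless.
#I don't think you need to memorize the code that appears on Kaggle. Unless you keep practicing, you will probably have forgotten it completely within a month anyway.
#Here, being able to read and understand the code counts as success. If you want to learn more about the algorithms behind it, you will need to do your own searching and research.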

you'll learn how to:
create a fully-connected neural network architecture
Build a neural network architecture whose layers are fully connected.

apply neural nets to two classic ML problems: regression and classification
Use neural networks to solve two classic machine learning problems: regression and classification.

train neural nets with stochastic gradient descent, and
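Train neural networks using stochastic gradient descent.
#stochastic: probabilistic, involving randomness
#gradient descent [algorithm]
: The derivative of a function gives its slope, and at a minimum the slope is 0. Gradient descent uses this to search for a minimum: it repeatedly takes a small step in the direction in which the function decreases (opposite to the gradient), which is why it is called gradient descent.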
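To make the note on gradient descent concrete, here is a minimal sketch in plain Python. It is ordinary (not stochastic) gradient descent on a one-variable function, and the function f(x) = (x - 3)^2, the learning rate, and the step count are all assumptions chosen for illustration, not something from the Kaggle tutorial.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        # Move opposite to the slope so the function value goes down.
        x = x - learning_rate * grad(x)
    return x


# Hypothetical example: f(x) = (x - 3)^2 has derivative 2 * (x - 3)
# and its minimum at x = 3.
def grad_f(x):
    return 2 * (x - 3)


x_min = gradient_descent(grad_f, start=0.0)
print(x_min)  # very close to 3
```

Each step shrinks the distance to the minimum by a constant factor here, so the result ends up very close to 3. The stochastic version used for neural networks estimates the gradient from small random batches of data instead of computing it exactly.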

improve performance with dropout, batch normalization, and other techniques
Improve performance with dropout, batch normalization, and other techniques.

The tutorials will introduce you to these topics with fully-worked examples, and then in the exercises, you'll explore these topics in more depth and apply them to real-world datasets.
The tutorials teach each topic through worked examples, and the exercises apply what was covered to real-world datasets.

Let's get started!

What is Deep Learning?

Some of the most impressive advances in artificial intelligence in recent years have been in the field of deep learning. Natural language translation, image recognition, and game playing are all tasks where deep learning models have neared or even exceeded human-level performance.
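Deep learning has driven some of the most remarkable advances in artificial intelligence in recent years. In natural language translation, image recognition, game playing, and other areas, deep learning models already perform close to, or even better than, humans.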

So what is deep learning? Deep learning is an approach to machine learning characterized by deep stacks of computations. This depth of computation is what has enabled deep learning models to disentangle the kinds of complex and hierarchical patterns found in the most challenging real-world datasets.
So what is deep learning? It is an approach to machine learning built on deep stacks of computations. This depth is what lets deep learning models untangle the complex, hierarchical patterns found in real-world datasets.

Through their power and scalability neural networks have become the defining model of deep learning. Neural networks are composed of neurons, where each neuron individually performs only a simple computation. The power of a neural network comes instead from the complexity of the connections these neurons can form.
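Thanks to their power and scalability, neural networks have become the defining model of deep learning. (In other words, once people started stacking more and more layers deep inside a single neural network model, the approach came to be called deep learning to set it apart.)

A neural network is made of neurons. Each neuron performs only a very simple computation; the power of the network comes from connecting many of these neurons into a complex computational network.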

The Linear Unit

So let's begin with the fundamental component of a neural network: the individual neuron. As a diagram, a neuron (or unit) with one input looks like:
Let's start with the basic building block of a neural network: the single neuron.
The diagram below shows a single neuron with one input.
(figure omitted)

y=wx+b

The input is x. Its connection to the neuron has a weight which is w. Whenever a value flows through a connection, you multiply the value by the connection's weight. For the input x, what reaches the neuron is w * x. A neural network "learns" by modifying its weights.
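For the input x, w is the weight. Whenever a value flows through the connection it is multiplied by the weight, so what reaches the neuron is w*x. A neural network "learns" by modifying its weights.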

The b is a special kind of weight we call the bias. The bias doesn't have any input data associated with it; instead, we put a 1 in the diagram so that the value that reaches the neuron is just b (since 1 * b = b). The bias enables the neuron to modify the output independently of its inputs.
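The value b is what we call the bias, and it acts regardless of the input. The tutorial explains it as an input fixed at 1 with weight b, so that 1*b = b reaches the neuron.
The bias lets the neuron adjust its output independently of its inputs.
#Seen as a linear function, the bias b and the weight w correspond to the intercept and the slope of the line on a graph.
#In regression analysis this bias term is usually called the intercept.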

The y is the value the neuron ultimately outputs. To get the output, the neuron sums up all the values it receives through its connections.
y is the neuron's output. The neuron adds up all the values it receives through its connections (here w*x and b) to produce y.
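The formula y = wx + b can be written directly as a small Python function. This is only a sketch of the arithmetic a linear unit performs; the function name and the numbers are made up for illustration.

```python
def linear_unit(x, w, b):
    """One-input linear unit: weight times input, plus bias."""
    return w * x + b


# Illustrative values only (not from the tutorial).
print(linear_unit(x=2.0, w=3.0, b=1.0))  # 7.0

# With x = 0 nothing arrives through the weighted connection,
# so the output is just the bias.
print(linear_unit(x=0.0, w=3.0, b=1.0))  # 1.0
```

The second call shows what "independent of the inputs" means: the bias still shifts the output even when the input contributes nothing.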

...(middle part omitted)...

Example - The Linear Unit as a Model
Though individual neurons will usually only function as part of a larger network, it's often useful to start with a single neuron model as a baseline. Single neuron models are linear models.
Individual neurons, each of which is a single linear model, are usually used as part of a larger network. Still, it is often useful to start with a single-neuron model as a baseline.

Let's think about how this might work on a dataset like 80 Cereals. Training a model with 'sugars' (grams of sugars per serving) as input and 'calories' (calories per serving) as output, we might find the bias is b=90 and the weight is w=2.5. We could estimate the calorie content of a cereal with 5 grams of sugar per serving like this:
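To see how this model works, let's use a real dataset, 80 Cereals, as an example.
Suppose we train a model with sugars (grams of sugar per serving) as input and calories (calories per serving) as output, and obtain a bias b=90 and a weight w=2.5.

We can then estimate the calorie content of a cereal with 5 grams of sugar per serving as follows.
calories = 2.5 x 5 + 90 = 102.5
...(figure omitted)...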
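The same estimate can be reproduced in a few lines of Python. The weight 2.5 and bias 90 are the values quoted in the text; the function name and the extra sugar amounts (0 and 10 grams) are assumptions added only to show how the prediction moves along the line.

```python
w, b = 2.5, 90  # weight and bias quoted in the text


def predict_calories(sugars):
    """Estimate calories per serving from grams of sugar per serving."""
    return w * sugars + b


for sugars in [0, 5, 10]:  # hypothetical sugar amounts
    print(sugars, predict_calories(sugars))
# 0 90.0
# 5 102.5
# 10 115.0
```

Every extra gram of sugar adds 2.5 calories to the estimate, and a cereal with no sugar is estimated at the bias, 90.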

Multiple Inputs
The 80 Cereals dataset has many more features than just 'sugars'. What if we wanted to expand our model to include things like fiber or protein content? That's easy enough. We can just add more input connections to the neuron, one for each additional feature. To find the output, we would multiply each input to its connection weight and then add them all together.
In fact, the 80 Cereals dataset has many more features (columns) than just sugars.
What if we wanted to extend the model above so that it also takes fiber or protein content as input?
We simply connect more inputs to the neuron: one connection per feature, each input multiplied by its own weight, and all the products added together.
y = w0x0 + w1x1 + w2x2 + b
...(figure omitted)...

A linear unit with two inputs will fit a plane, and a unit with more inputs than that will fit a hyperplane.
With two inputs the linear model fits a plane; with more inputs it fits a hyperplane.
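A linear unit with several inputs is just a weighted sum plus the bias. The sketch below uses plain Python; the function name, the three weights, and the feature values are hypothetical, since the tutorial does not give fitted weights for fiber or protein.

```python
def linear_unit_multi(inputs, weights, bias):
    """Multiply each input by its weight, add the products, then add the bias."""
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


# Hypothetical feature values and weights, for illustration only.
features = [5.0, 2.0, 3.0]   # sugars, fiber, protein (grams per serving)
weights = [2.5, -1.0, 4.0]   # w0, w1, w2
bias = 90.0

calories = linear_unit_multi(features, weights, bias)
print(calories)  # 112.5
```

Adding a feature only adds one more weight and one more term to the sum, which is why the model stays linear however many inputs it has.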

Linear Units in Keras
The easiest way to create a model in Keras is through keras.Sequential, which creates a neural network as a stack of layers. We can create models like those above using a dense layer (which we'll learn more about in the next lesson).
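The easiest way to create a model in Keras is keras.Sequential, which builds a neural network as a stack of layers. A model like the ones above can be built with a dense layer.
#Keras with a capital K refers to the framework as a whole; lowercase keras is the name of the Python module you import in code.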

We could define a linear model accepting three input features ('sugars', 'fiber', and 'protein') and producing a single output ('calories') like so:

Using the 80 Cereals dataset mentioned above, we want a model that takes sugars, fiber, and protein as input features and produces calories as output.

#<code below>
from tensorflow import keras
from tensorflow.keras import layers

# Create a network with 1 linear unit
model = keras.Sequential([
    layers.Dense(units=1, input_shape=[3])
])

With the first argument, units, we define how many outputs we want. In this case we are just predicting 'calories', so we'll use units=1.
The first argument to layers.Dense, units=1, says how many outputs we want. Here the only output feature is calories, so units=1.

With the second argument, input_shape, we tell Keras the dimensions of the inputs. Setting input_shape=[3] ensures the model will accept three features as input ('sugars', 'fiber', and 'protein').
The second argument, as you might expect, is the number of input features. It is called the dimensions of the input because each feature adds a dimension: one feature gives a line, two a plane, three a space, and anything beyond that a hyperspace. Here we use three features (sugars, fiber, protein), so input_shape=[3].
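One way to check your understanding of units and input_shape is to count the numbers the layer has to learn: one weight per input for each unit, plus one bias per unit. The helper below is a hypothetical plain-Python sketch of that count, not a Keras function.

```python
def dense_param_count(num_inputs, units):
    """Weights (one per input, per unit) plus one bias per unit."""
    return num_inputs * units + units


# The model in the text: 3 input features, 1 linear unit.
print(dense_param_count(num_inputs=3, units=1))  # 4 -> w0, w1, w2 and b
```

For the model above this gives 4, matching the three weights and one bias in y = w0x0 + w1x1 + w2x2 + b.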

..(omitted)..

Why is input_shape a Python list?

The data we'll use in this course will be tabular data, like in a Pandas dataframe. We'll have one input for each feature in the dataset. The features are arranged by column, so we'll always have input_shape=[num_columns]. The reason Keras uses a list here is to permit use of more complex datasets. Image data, for instance, might need three dimensions: [height, width, channels].
The datasets used in this course are tabular, like a Pandas dataframe. Features are arranged in columns, so we can always think of input_shape = [number of columns]. Keras takes a list here so that it can also accept more complex inputs such as image data, which might need three dimensions: [height, width, channels].
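As a rough sketch of where the number inside input_shape comes from, the snippet below builds a tiny table in plain Python and derives the shape from its feature columns. The rows and the image size are made-up values for illustration; a real dataset would be loaded from a file.

```python
# Hypothetical rows standing in for a table such as 80 Cereals.
feature_names = ["sugars", "fiber", "protein"]
rows = [
    {"sugars": 5, "fiber": 2.0, "protein": 3, "calories": 110},
    {"sugars": 8, "fiber": 0.0, "protein": 2, "calories": 120},
]

# Keep only the feature columns; calories is the target, not an input.
X = [[row[name] for name in feature_names] for row in rows]

input_shape = [len(feature_names)]  # one entry: the number of feature columns
print(input_shape)  # [3]

# An image needs more than one number to describe a single input.
image_shape = [28, 28, 3]  # hypothetical height, width, channels
print(len(image_shape))  # 3
```

A list with one entry is enough for tabular data, while the same argument can hold three entries for images, which is the flexibility the text describes.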

<Try the exercise yourself.>
