[Lab6-1&6-2] Implementing Softmax & Fancy softmax classification

케이스윔의 개발 블로그 (kswim's dev blog)

Deep Learning for Everyone (모두를 위한 딥러닝)

kswim 2018. 5. 14. 12:53

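In real life, picking one option out of many comes up far more often than picking one out of two! That's when softmax is used.

The values that come out of a logistic classifier are nothing more than scores, so they have to go through the softmax function to come out as probabilities! These probabilities sum to 1.

In TensorFlow, the matrix product XW plus the bias b can be written as tf.matmul(X, W)+b.

The softmax function? hypothesis = tf.nn.softmax(tf.matmul(X, W)+b)

The output is the probability of each label.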
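To see the "scores become probabilities that sum to 1" part without TensorFlow, here is a minimal sketch in plain Python. The `softmax` helper and the example scores are made up for illustration; this is not what tf.nn.softmax runs internally.

```python
import math

def softmax(scores):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtracting the max leaves the result unchanged but avoids overflow.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]   # hypothetical scores for three classes
probs = softmax(scores)
print(probs)               # roughly [0.659, 0.242, 0.099]
print(sum(probs))          # 1.0 up to floating-point rounding
```

The largest score gets the largest probability, and the ordering of the classes is preserved.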

cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
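The cost line above is easier to follow with numbers plugged in. A small sketch in plain Python, assuming two made-up samples whose true class is index 2; `cross_entropy` is a hypothetical helper that mirrors -tf.reduce_sum(Y * tf.log(hypothesis), axis=1) for one row.

```python
import math

def cross_entropy(y_one_hot, probs):
    """-sum(Y * log(hypothesis)) for a single sample."""
    return -sum(y * math.log(p) for y, p in zip(y_one_hot, probs))

# Hypothetical data: the true class is index 2 for both samples.
labels = [[0, 0, 1],
          [0, 0, 1]]
predictions = [[0.1, 0.1, 0.8],   # confident and right -> small cost
               [0.7, 0.2, 0.1]]   # confident and wrong -> large cost

per_sample = [cross_entropy(y, p) for y, p in zip(labels, predictions)]
cost = sum(per_sample) / len(per_sample)   # the reduce_mean step
print(per_sample, cost)
```

Because Y is one-hot, only the log-probability of the true class survives the sum, so each sample's cost is just -log(probability assigned to the correct class).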

<Lab6-1>

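import tensorflow as tf

# 8 samples with 4 features each
x_data = [[1, 2, 1, 1],
          [2, 1, 3, 2],
          [3, 1, 3, 4],
          [4, 1, 5, 5],
          [1, 7, 5, 5],
          [1, 2, 5, 6],
          [1, 6, 6, 6],
          [1, 7, 7, 7]]

# One label row per sample (8 rows, matching x_data)
y_data = [[0, 0, 1],
          [0, 0, 1],
          [0, 0, 1],
          [0, 1, 0],
          [0, 1, 0],
          [0, 1, 0],
          [1, 0, 0],
          [1, 0, 0]]

# One-hot encoding: as in y_data, exactly one of the three entries in a row is 1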

X = tf.placeholder("float", [None, 4])
Y = tf.placeholder("float", [None, 3])
# The width of Y is the number of labels = the number of classes!
nb_classes = 3

W = tf.Variable(tf.random_normal([4, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')

# softmax turns the scores XW + b into probabilities
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)

# Cross-entropy cost, averaged over the samples
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for step in range(2001):
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}))

    # After training, predict the class of a new sample
    a = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9]]})
    # tf.arg_max is a deprecated alias of tf.argmax
    print(a, sess.run(tf.arg_max(a, 1)))

# arg_max turns the predicted probability vector a into the index of the most likely class!

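*arg_max(a, 1) returns, for each row of a, the index of the largest value. If the second parameter is 0 instead, it returns the index of the largest value in each column. An axis of 2 only makes sense for a tensor with three or more dimensions, where it reduces along the third axis; a is 2-D here, so 2 would be an error!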
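The row-versus-column difference is easy to mix up, so here is a rough plain-Python imitation of what the axis argument does on a 2-D input. The helper names and the `probs` values are invented for the example; they are not TensorFlow functions.

```python
def argmax_axis1(matrix):
    """Index of the largest value in each row (like axis=1)."""
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]

def argmax_axis0(matrix):
    """Index of the largest value in each column (like axis=0)."""
    columns = list(zip(*matrix))
    return [max(range(len(col)), key=col.__getitem__) for col in columns]

# Hypothetical softmax output for two samples and three classes
probs = [[0.1, 0.7, 0.2],
         [0.8, 0.1, 0.1]]

rows = argmax_axis1(probs)   # one class index per sample -> [1, 0]
cols = argmax_axis0(probs)   # one sample index per class -> [1, 0, 0]
print(rows, cols)
```

For classification you want one answer per sample, which is why the lab passes 1 as the axis.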

<Fancy softmax classification using softmax_cross_entropy_with_logits>

The raw scores are called logits!

logits = tf.matmul(X, W)+b

hypothesis = tf.nn.softmax(logits)

#Cross entropy cost/loss

cost = tf.reduce_mean(-tf.reduce_sum(Y*tf.log(hypothesis), axis=1))

-> Here Y is one-hot!

Written with the new function:

cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y_one_hot)

cost = tf.reduce_mean(cost_i)
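Both versions compute the same per-sample cost. A commonly given reason to prefer the one that takes logits directly is numerical stability: it never has to take log of a probability that has rounded down to 0. The sketch below illustrates the idea in plain Python with a log-sum-exp; the function names are hypothetical, and this is not a copy of TensorFlow's actual implementation.

```python
import math

def softmax(logits):
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cost_two_step(logits, y_one_hot):
    """softmax first, then -sum(Y * log(hypothesis))."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(y_one_hot, probs))

def cost_from_logits(logits, y_one_hot):
    """The same quantity computed directly from the logits via log-sum-exp."""
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(z - top) for z in logits))
    return sum(y * (log_sum_exp - z) for y, z in zip(y_one_hot, logits))

logits = [2.0, 1.0, 0.1]     # hypothetical scores
y = [1, 0, 0]                # true class is index 0
two_step = cost_two_step(logits, y)
direct = cost_from_logits(logits, y)
print(two_step, direct)      # the two values agree

# With extreme logits the softmax probability of class 1 rounds to 0.0,
# so the two-step route would fail on log(0); the direct route still works.
extreme = cost_from_logits([1000.0, 0.0, 0.0], [0, 1, 0])
print(extreme)
```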

tf.one_hot and reshape

Y = tf.placeholder(tf.int32, [None, 1])

# In the Y above, None means any number of rows (n labels) can come in, and 1 means each row holds a single label value!

Y_one_hot = tf.one_hot(Y, nb_classes)

# Given an N-dimensional input, one_hot returns N+1 dimensions! With nb_classes = 7, [[0],[3]] -> [[[1,0,0,0,0,0,0]], [[0,0,0,1,0,0,0]]], so the shape goes from (2, 1) to (2, 1, 7)

Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes])

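Whoa! At first I didn't get why a variable called Y_one_hot suddenly shows up here. The point is that when picking one of the classes 0~6, the cost function doesn't want the number 0~6 itself as the target; it wants a one-hot vector such as 0001000, which stands for 3. The reshape then squeezes the extra dimension back out so that each row is one such vector!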
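The shape change is the confusing part, so here is a tiny plain-Python imitation of the two steps using nested lists. The `one_hot` helper is a stand-in written for this example, not the TensorFlow function.

```python
def one_hot(index, depth):
    """A vector of length depth with a 1.0 at position index."""
    return [1.0 if i == index else 0.0 for i in range(depth)]

nb_classes = 7
Y = [[0], [3]]    # shape (2, 1), like what gets fed into the placeholder

# Imitates tf.one_hot: every innermost value becomes a vector -> shape (2, 1, 7)
Y_one_hot = [[one_hot(label, nb_classes) for label in row] for row in Y]
print(Y_one_hot)

# Imitates tf.reshape(Y_one_hot, [-1, nb_classes]) -> shape (2, 7)
Y_flat = [vec for row in Y_one_hot for vec in row]
print(Y_flat)
```

After the reshape, each row lines up with one row of logits, which is the layout the cost function expects.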

import tensorflow as tf
import numpy as np

tf.set_random_seed(777)  # for reproducibility

xy = np.loadtxt('data-04-zoo.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]    # every column except the last: the features
y_data = xy[:, [-1]]    # the last column, kept 2-D: the class labels

nb_classes = 7
# There are 7 classes in total!

X = tf.placeholder(tf.float32, [None, 16])
# X has 16 feature variables, hence 16!
Y = tf.placeholder(tf.int32, [None, 1])  # 0 ~ 6

Y_one_hot = tf.one_hot(Y, nb_classes)
Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes])
# reshape it into the shape we want: (n, nb_classes)

W = tf.Variable(tf.random_normal([16, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')

logits = tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(logits)

# Cross entropy cost/loss
cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                 labels=Y_one_hot)
cost = tf.reduce_mean(cost_i)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

prediction = tf.argmax(hypothesis, 1)
# Turns the probabilities into a class index between 0 and 6.
correct_prediction = tf.equal(prediction, tf.argmax(Y_one_hot, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for step in range(2000):
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 100 == 0:
            loss, acc = sess.run([cost, accuracy], feed_dict={
                                 X: x_data, Y: y_data})
            print("Step: {:5}\tLoss: {:.3f}\tAcc: {:.2%}".format(
                step, loss, acc))

    # After training, predict from x_data into pred, then compare with the true values below.
    pred = sess.run(prediction, feed_dict={X: x_data})
    # flatten() turns [[1],[0]] into [1, 0]
    for p, y in zip(pred, y_data.flatten()):
        print("[{}] Prediction: {} True Y: {}".format(p == int(y), p, int(y)))
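The accuracy node and the flatten() comparison at the end do essentially the same thing: line up predictions with true labels and count the matches. A small sketch in plain Python, with made-up predictions and labels and hypothetical helper names:

```python
def flatten(column):
    """[[1.0], [0.0]] -> [1.0, 0.0], like y_data.flatten() on an (n, 1) array."""
    return [value for row in column for value in row]

def accuracy(pred, labels):
    """Mean of 1.0 for a correct prediction and 0.0 for a wrong one."""
    correct = [1.0 if p == int(y) else 0.0 for p, y in zip(pred, labels)]
    return sum(correct) / len(correct)

pred = [0, 3, 6, 1]                      # hypothetical predicted classes
y_data = [[0.0], [3.0], [5.0], [1.0]]    # hypothetical labels, shape (4, 1)

acc = accuracy(pred, flatten(y_data))
print("Acc: {:.2%}".format(acc))         # 3 of 4 correct -> Acc: 75.00%
```

Casting the True/False comparison to floats and averaging is the same trick as tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) in the graph.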

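It's getting really hard now. Up through Lec4 I understood things after listening once, but now if I don't take notes right after a lecture, I can barely remember it when I look at it again!

I need to get back to studying, even if it's just one lecture a day.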
