A Book from the Sky 天书

天書

Exploring the Latent Space of Chinese Handwriting

制造手寫的漢字

By @genekogan, 2015 December 15.

@genekogan,2015年12月15日

These images were created by a deep convolutional generative adversarial network (DCGAN) trained on a database of handwritten Chinese characters. The network was built with code by Alec Radford, based on the paper published by Radford, Luke Metz, and Soumith Chintala in November 2015.

這些圖像是由 deep convolutional generative adversarial network (DCGAN) 創造的,訓練在手寫漢字的語料庫上,使用 Alec Radford 製作的軟件,而基於 Radford、Luke Metz、Soumith Chintala 2015年十一月所出版的文章。

The title is a reference to the 1988 book of the same name by Xu Bing, who composed thousands of fictitious glyphs in the style of traditional Chinese prints of the Song and Ming dynasties.

標題是引用徐冰1988出版的《天書》。在此本,藝術家刻製兩千餘模仿宋明朝的“偽漢字”。

A DCGAN is a type of convolutional neural network that can learn an abstract representation of a collection of images. It does so through competition between a "generator" which fabricates images and a "discriminator" which tries to tell the generator's images apart from authentic ones. After training, the generator can produce convincing samples reminiscent of the originals.
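The competition described above can be sketched as a pair of loss functions. This is a minimal illustration rather than the code used for this page: it assumes the discriminator outputs a probability that an image is real, and it uses the common "non-saturating" form of the generator loss. The function names are made up for this example.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy for the discriminator.

    d_real: discriminator outputs in (0, 1) on real images
    d_fake: discriminator outputs in (0, 1) on generated images
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    # Low when real images score near 1 and generated images score near 0.
    return float(-np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps)))

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss (an assumption, see the note above)."""
    d_fake = np.asarray(d_fake, dtype=float)
    # Low when the discriminator scores generated images near 1.
    return float(-np.mean(np.log(d_fake + eps)))

# A discriminator that separates real from fake well has a low loss,
# while the generator whose fakes it rejects has a high one.
print(discriminator_loss([0.9, 0.8], [0.1, 0.2]))
print(generator_loss([0.1, 0.2]))
```

Training alternates between lowering the first loss by updating the discriminator and lowering the second by updating the generator, so each network improves against the other.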

Below, a DCGAN is trained on a labeled subset of ~1M handwritten simplified Chinese characters, after which the generator is able to produce fake images of characters not found in the original dataset.

DCGAN (捲積人工神經網絡) 是一種人工神經網絡。DCGAN能分析語料庫原文的真實漢字的圖像而能製造模仿偽漢字,如下。


[Figure: real (真实) and generated (制造的) samples of the characters for: is, in, not, country, on, have, large, for, year, with, out, minute, city, learn, come, day, can, reason.]

Exploring the latent space 探索潛在的空間

Each image the generator produces is determined by a vector in a high-dimensional latent space, which lets us peer into its imagination. By traversing this space, we can explore the generator's perception of possible characters.
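One way to traverse the latent space is a small random walk that starts from a single latent vector. The sketch below only produces the sequence of vectors; feeding each one to a trained generator is not shown. The 100-dimensional vector, the uniform range of [-1, 1], the step size, and the function name are assumptions for illustration, not details taken from this page.

```python
import numpy as np

def latent_walk(z_start, n_steps, step_size=0.05, seed=0):
    """Return an array of latent vectors that drift away from z_start."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z_start, dtype=float)
    path = [z.copy()]
    for _ in range(n_steps):
        # Take a small Gaussian step, then stay inside the assumed [-1, 1] range.
        z = np.clip(z + step_size * rng.standard_normal(z.shape), -1.0, 1.0)
        path.append(z.copy())
    return np.stack(path)

# Assumed setup: a 100-dimensional latent vector drawn uniformly from [-1, 1].
z0 = np.random.default_rng(1).uniform(-1.0, 1.0, size=100)
path = latent_walk(z0, n_steps=16)
print(path.shape)  # one row per latent vector along the walk
```

Because consecutive vectors differ only slightly, the images generated from them should change gradually from one to the next.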

我們可以探索人工神經網絡製造軟件所幻想的偽漢字。


[Figure: latent-space explorations around the characters for: have, learn, sky, child, is, strength, to, from, air, do, time, come, body, not, control, large.]

Reading between the lines 筆畫之間

Rather than simply exploring the neighborhood around individual characters, we can span the latent space between characters as well. By producing samples along a straight line from one character to the next, we get an impression of imaginary characters interpolated between real ones, perhaps corresponding to semantically intermediate concepts.
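Sampling along a straight line between two points is ordinary linear interpolation. The following sketch assumes each character is represented by a vector (`z_a` and `z_b` are placeholder names), and it returns the evenly spaced points that would be passed to the generator.

```python
import numpy as np

def interpolate(z_a, z_b, n_samples):
    """Evenly spaced points on the straight line from z_a to z_b, inclusive."""
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    # t runs from 0 (entirely z_a) to 1 (entirely z_b).
    t = np.linspace(0.0, 1.0, n_samples)[:, None]
    return (1.0 - t) * z_a + t * z_b

# Placeholder vectors standing in for two characters.
line = interpolate(np.zeros(4), np.ones(4), n_samples=5)
print(line)
```

The first and last rows reproduce the two endpoints, and the rows in between are the "imaginary" intermediate points.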

在兩個真實漢字的差距空間之中,軟件能製造偽漢字。


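[Figure: interpolations between characters, one sequence per row:
eye → face → body
people → culture
recognize → remember → learn
city → capital → country → world
year → month → week → day
air → ground → water
open → close
they → she → he
reason → say]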

Radical interpolation 部首內插 (插值)

Chinese characters are composed of radicals, which are graphical components that serve as the most basic semantic grouping of characters and usually hint at the character's meaning. For instance, the characters 们 (they), 仔 (youngster) and 以 (with, using) all contain the radical 人 (person), appearing as 亻 in the first two. One of the most striking results was the preservation of radicals across character interpolations. For example, to the lower left is an interpolation through the characters 后 (after), 台 (platform), and 名 (name), which share the radical 口 (mouth), highlighted in red. To the lower right, we traverse a sequence of characters sharing the 人 radical. Remarkably, the 人 appears coherent during the transitions, even as it glides into different forms and positions!

左下是擁有“口”部首的“后”、“台”、“名”三個字的內插 (interpolation)。 右下是擁有“人”部首的幾個字的系列。在不同的字的轉變,能看得很清楚“人”字的連續性。


[Figure, left: Radical 30, 口 (mouth): after → platform → name]
[Figure, right: Radical 9, 人 (person): person → from → meeting → now → still → office → item]

Linguistic algebra 文字代數

An active area of research in computational linguistics is deriving geometric representations of words whose spatial interrelationships closely correlate with their semantic ones. Relationships among these "word vectors" can be expressed by equations such as <king> − <man> + <woman> = <queen>, even though the models that learn them have no prior knowledge of the words' meanings.
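The analogy equation can be worked through with a toy example. The two-dimensional vectors below are invented for illustration (one axis loosely standing for royalty, the other for gender). Real word vectors are learned from text and have hundreds of dimensions.

```python
import numpy as np

# Toy vectors, invented for this example: axis 0 ~ royalty, axis 1 ~ gender.
vectors = {
    "king":  np.array([1.0,  1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
    "child": np.array([-0.5, 0.0]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def nearest(target, exclude=()):
    """Word whose vector points most nearly in the direction of target."""
    candidates = {w: v for w, v in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(candidates[w], target))

result = vectors["king"] - vectors["man"] + vectors["woman"]
answer = nearest(result, exclude=("king", "man", "woman"))
print(answer)  # queen
```

Excluding the three input words from the search is the usual convention, since the result often lies closest to one of them.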

Since the DCGAN's ability to interpolate is just a special case of its more general capacity for combining character classes arithmetically, we can attempt to determine if the above analogy and others also underpin our writing systems. Do the following equations match our visual expectations?
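To see why interpolation is a special case of arithmetic, suppose the generator is conditioned on a vector with one entry per character class. That is an assumption about the setup, and the class indices below are hypothetical. Interpolation then uses weights that sum to 1, while the analogy uses weights +1, −1, +1.

```python
import numpy as np

def combine_classes(terms, n_classes):
    """Weighted sum of one-hot class vectors.

    terms: list of (class_index, weight) pairs
    """
    y = np.zeros(n_classes)
    for index, weight in terms:
        y[index] += weight
    return y

# Hypothetical class indices for three characters out of a tiny set of four.
KING, MALE, FEMALE = 0, 1, 2
N_CLASSES = 4

# Interpolation: halfway between two classes.
halfway = combine_classes([(KING, 0.5), (MALE, 0.5)], N_CLASSES)

# Arithmetic: king - male + female.
analogy = combine_classes([(KING, 1.0), (MALE, -1.0), (FEMALE, 1.0)], N_CLASSES)

print(halfway)
print(analogy)
```

Either vector would then be passed to the generator in place of a single one-hot class label.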

DCGAN雖然不知道漢字圖像的意義,但還能按照數學邏輯執行運算。



<king> − <male> + <female> = [generated character image]

Since "queen" is usually not expressed as a single character, we can't compare.

Below is a matrix of interpolation loops between every pair among the 20 most frequent characters.

之下有二十個最普通的漢字之中的內插 (又稱插值, interpolations)

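[Figure: matrix of interpolation loops between every pair among the 20 most frequent characters: is, at, in, not, (past tense), country, on, have, large, for, year, this, (individual), out, time, minute, people, city, do, to.]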
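A matrix like the one above needs one looping sequence for every pair of characters. The sketch below is one possible way to build such loops, assuming each character is represented by a vector. The raised-cosine timing and the function names are choices made for this example, not details taken from this page.

```python
import numpy as np
from itertools import combinations

def interpolation_loop(v_a, v_b, n_frames):
    """Frames that travel from v_a to v_b and back again."""
    # The weight t runs 0 -> 1 -> 0 over one period, so the last frame
    # leads smoothly back into the first when the sequence repeats.
    phase = 2.0 * np.pi * np.arange(n_frames) / n_frames
    t = (0.5 * (1.0 - np.cos(phase)))[:, None]
    return (1.0 - t) * v_a + t * v_b

def all_pair_loops(vectors, n_frames):
    """One loop per unordered pair of rows in vectors, keyed by (i, j)."""
    return {
        (i, j): interpolation_loop(vectors[i], vectors[j], n_frames)
        for i, j in combinations(range(len(vectors)), 2)
    }

# Placeholder: four one-hot vectors standing in for four characters.
vectors = np.eye(4)
loops = all_pair_loops(vectors, n_frames=8)
print(len(loops))  # 6 pairs for 4 characters; 20 characters would give 190
```

Each loop starts at the first character, reaches the second at its midpoint, and returns to the first.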

The software, data, trained model, and code are preserved in this Terminal snap in which the next command will produce a new version of each image on this page. An extended version of this page with more images can be seen here.

此研究計畫主要是公開進行的,而使用了很多網上的意見。軟件代碼在此連結

Thanks to Nick Frisch, Francis Tseng, Tom White, and Cheng-Lin Liu for their contributions, suggestions, and advice.

感謝 Nick Frisch, Francis Tseng, Tom White, Cheng-Lin Liu的貢獻與建議。