  1. (17:42:54) The topic for #simwaifu is: We have a trello board - https://trello.com/b/VaUT0Vko/waifu-simulation-project available to members only, sign up and let me know your username if you want an invite. Private bitbucket at https://bitbucket.org/kiradess/simwaifu same deal, need your username to add you.
  2. (17:42:54) Topic for #simwaifu set by kiradess!~quassel@Rizon-85974635.dynamic.ip.windstream.net at 14:53:37 on 15.10.2015
  3. (18:24:40) apt-get_ [~apt-get@D97330BC.55ADF584.217FDA43.IP] entered the room.
  4. (18:55:20) kiradess: hating lua right now
  5. (18:57:21) kiradess: just all around useless as a general scripting language
  6. (19:08:33) emeraldgreen: it's less bloated than angelscript
  7. (19:08:41) emeraldgreen: and fast
  8. (19:15:44) kiradess: not saying it's bad for game engine stuff
  9. (19:16:00) kiradess: but fuck if it doesn't have 2% the functionality of python
  10. (19:16:10) kiradess: feels like i'm writing C
  11. (19:16:29) emeraldgreen: it's also used for cutting edge deep learning (torch), outcompeting python in that space
  12. (19:16:39) kiradess: I mean for general stuff, I was fucking around with rewriting a bunch of scripts in it for practice
  13. (19:17:12) kiradess: and it lacked some really basic math and string stuff, it was weird
  14. (19:17:52) emeraldgreen: hmm, maybe there are libs for that
  15. (19:18:14) emeraldgreen: I stopped caring about languages because I write JS, lua and python sometimes
  16. (19:21:44) kiradess: same, lua wouldnt have even been on my radar if it wasn't for urho3d
  17. (19:23:01) emeraldgreen: I wrote a little lua when I modded S.T.A.L.K.E.R., also I read its sources when I tried to create my own small dynamic language
  18. (19:24:11) kiradess: cool
  19. (19:24:54) emeraldgreen: it'd be cool if Urho had built-in node.js
  20. (19:25:13) emeraldgreen: +performance +popularity
  21. (19:30:57) kiradess: dont think node.js has 97% the speed of C
  22. (19:31:25) emeraldgreen: it comes close at factor 2-3
  23. (19:31:44) emeraldgreen: V8 is the fastest dynamic language compiler in the world
  24. (19:32:20) emeraldgreen: luajit used to hold that title but now it's 2x slower than V8
  25. (19:32:55) kiradess: cool
  26. (21:10:26) apt-get_: create a new JIT compiler that has 95% the speed of C in python
  27. (21:10:26) apt-get_: revolutionize the programming world
  28. (21:10:26) apt-get_: for waifus
  29. (21:53:46) kiradess: nah
  30. (22:08:28) emeraldgreen: apt-get_ there are 3 (!) different JIT compilers for python
  31. (22:16:58) apt-get_: are there emeraldgreen
  32. (22:17:01) apt-get_ is now known as apt-get
  33. (22:17:05) apt-get: huh
  34. (22:17:22) apt-get: I know of pypy but I thought it was the only one
  35. (22:18:00) emeraldgreen: pypy, pyston and numba
  36. (27.10.2015 00:55:40) apt-get left the room (quit: Remote host closed the connection).
  37. (01:32:07) apt-get [~apt-get@ore.katana.bake.kizu.genji.monogatari] entered the room.
  38. (03:21:25) The account has disconnected and you are no longer in this chat. You will automatically rejoin the chat when the account reconnects.
  39. (11:08:51) The topic for #simwaifu is: We have a trello board - https://trello.com/b/VaUT0Vko/waifu-simulation-project available to members only, sign up and let me know your username if you want an invite. Private bitbucket at https://bitbucket.org/kiradess/simwaifu same deal, need your username to add you.
  40. (11:08:51) Topic for #simwaifu set by kiradess!~quassel@Rizon-85974635.dynamic.ip.windstream.net at 14:53:37 on 15.10.2015
  41. (12:14:35) kiradess: I was thinking about speech synthesis recently
  42. (12:14:45) kiradess: most software for creating a voice model relies on a speaker recording themselves saying a series of set phrases
  43. (12:15:12) kiradess: some few hundred/few thousand words in total, covering all the common diphones/triphones
  44. (12:27:25) kiradess: anyway
  45. (12:27:56) kiradess: that really only works if you've got a voice people can stand to listen to, and a lot of time on your hands
  46. (12:28:44) kiradess: I was wondering if anyone has attempted to train a voice model on unlabeled data, by means of speech recognition
  47. (12:30:23) kiradess: instead of say.. hundreds or thousands of carefully labeled and sliced words, using something like hundreds of HOURS of someone's podcasts or other source of speech without background noise
  48. (12:30:48) kiradess: take the slowest, most accurate speech parser, run it over some audio, keep only the sections it recognized with high confidence
  49. (12:31:35) kiradess: repeat until you have enough samples for one of the voice modelers out there like festvox
  50. (12:38:37) kiradess: I know it's done for recognition
  51. (12:39:11) kiradess: that is, train a recognition model on a large amount of unlabeled audio, somehow
  52. (12:39:49) kiradess: I just don't know if anyone has taken the identified, and now labeled, audio segments from that process and fed them into a voice modeler
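[A minimal sketch of the bootstrapping loop kiradess describes above, assuming a hypothetical recognize() callable that yields utterances with confidence scores; the threshold and label format are placeholders, not any particular toolkit's API.]

    # Sketch: harvest high-confidence utterances from unlabeled audio for TTS training.
    # recognize() is a stand-in for an ASR toolkit; it is assumed to yield
    # (start_sec, end_sec, text, confidence) tuples for each utterance found in a file.
    CONF_THRESHOLD = 0.95

    def harvest(audio_files, recognize, out_labels):
        kept = 0
        with open(out_labels, "w") as labels:
            for wav in audio_files:
                for start, end, text, conf in recognize(wav):
                    if conf < CONF_THRESHOLD:
                        continue  # discard ambiguous utterances; purity matters more than recall here
                    labels.write(f"{wav}\t{start:.2f}\t{end:.2f}\t{text}\n")
                    kept += 1
        return kept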
  53. (12:46:51) The account has disconnected and you are no longer in this chat. You will automatically rejoin the chat when the account reconnects.
  54. (12:47:13) The topic for #simwaifu is: We have a trello board - https://trello.com/b/VaUT0Vko/waifu-simulation-project available to members only, sign up and let me know your username if you want an invite. Private bitbucket at https://bitbucket.org/kiradess/simwaifu same deal, need your username to add you.
  55. (12:47:13) Topic for #simwaifu set by kiradess!~quassel@Rizon-85974635.dynamic.ip.windstream.net at 14:53:37 on 15.10.2015
  56. (12:49:26) emeraldgreen left the room (quit: Ping timeout: 240 seconds).
  57. (12:51:12) kiradess: either I'm not searching with the right terms or it's a shit idea and no one has tried it
  58. (13:27:04) kiradess: not thrilled to be hung up on 3d stuff when i'd really like to dive into speech and NLP stuff
  59. (13:27:32) kiradess: python has some great libraries for that i've been meaning to use, like NLTK
  60. (13:47:45) kiradess: http://lands.let.ru.nl/cgn/publs/2002_06.pdf this
  61. (13:48:40) kiradess: >Spoken Dutch Corpus project
  62. (13:48:41) kiradess: kek
  63. (13:51:14) kiradess: >For the speech technologist segmentations are indispensable for the initial training of acoustic ASR models, the development of TTS systems and speech research in general.
  64. (13:55:14) kiradess: I think it's a good idea, especially since error rate is not that important
  65. (13:56:05) kiradess: a lot of what I've been reading is people using automatic speech recognition to attempt to segment and label large bodies of raw audio
  66. (13:56:21) kiradess: but their focus has been on reducing the error rate as much as possible
  67. (13:56:44) kiradess: in my use, any ambiguous utterances can just be discarded
  68. (13:57:54) kiradess: you might have a spoken sentence of 12 words, of which 8 were recognized with 99.7%+ confidence or something, from which you obtain 5 new diphones for your voice model
  69. (13:58:14) kiradess: is how I imagine it playing out
  70. (14:03:53) kiradess: >However, the confidence intervals may still be useful for other applications such as TTS systems for which the segments with reliable boundary positions can be selected automatically.
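[A rough illustration of the per-word filtering kiradess describes; the confidence cutoff and the phonemize() lookup (e.g. a pronunciation dictionary) are assumptions.]

    # Sketch: from one recognized sentence, keep only the words above a confidence
    # cutoff and collect the diphones they add to the current voice inventory.
    def new_diphones(words, confidences, phonemize, inventory, cutoff=0.997):
        gained = set()
        for word, conf in zip(words, confidences):
            if conf < cutoff:
                continue  # low-confidence words are simply discarded
            phones = phonemize(word)  # e.g. "hello" -> ["HH", "AH", "L", "OW"]
            for a, b in zip(phones, phones[1:]):
                if (a, b) not in inventory:
                    gained.add((a, b))
        inventory.update(gained)
        return gained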
  71. (15:35:37) apt-get [~apt-get@ore.katana.bake.kizu.genji.monogatari] entered the room.
  72. (16:05:38) apt-get_ [~apt-get@E941CCAA.AE17DAB.A8BCC5AB.IP] entered the room.
  73. (16:08:51) apt-get left the room (quit: Ping timeout: 240 seconds).
  74. (16:12:11) apt-get__ [~apt-get@7D5356E7.DE55F473.BEE40085.IP] entered the room.
  75. (16:15:20) apt-get_ left the room (quit: Ping timeout: 240 seconds).
  76. (16:25:56) apt-get__ is now known as apt-get
  77. (19:22:48) emeraldgreen1: kiradess You have a good idea about speech synthesis
  78. (19:24:35) emeraldgreen1: *bought a spare used PC (c2d 2.4 ghz) + lcd monitor for 50$*
  79. (19:24:40) emeraldgreen1: feels good
  80. (19:28:09) kiradess: nice
  81. (19:29:15) kiradess: i have a pc sitting around i'm not using either, its an amd e-series apu
  82. (19:29:17) kiradess: dual core, 1.8ghz, sips power
  83. (19:29:28) kiradess: been meaning to make it a server for something but meh
  84. (19:29:43) emeraldgreen1: my main machine is 6-core amd
  85. (19:29:44) kiradess: oh, speech shit
  86. (19:29:58) kiradess: i got a pentium g3258
  87. (19:30:01) emeraldgreen1: I will probably use this new pc for work
  88. (19:30:16) kiradess: so speech, yeah I thought it was a good idea
  89. (19:30:26) kiradess: remains to be seen if it works in practice though
  90. (19:30:40) kiradess: need to read up on prior work
  91. (19:30:54) kiradess: automatic speech segmentation
  92. (19:30:59) emeraldgreen1: there is deep speech synthesis, but it hasn't outperformed traditional approaches yet
  93. (19:31:41) emeraldgreen1: e.g. http://research.microsoft.com/en-us/projects/dnntts/
  94. (19:32:20) kiradess: ill give it a look in a min
  95. (19:33:39) emeraldgreen1: >that really only works if you've got a voice people can stand to listen to, and a lot of time on your hands
  96. (19:33:39) emeraldgreen1: It should be possible to apply voice morphing software to tts output, of course some quality will be lost
  97. (19:34:40) emeraldgreen1: >not thrilled to be hung up on 3d stuff when i'd really like to dive into speech and NLP stuff
  98. (19:34:40) emeraldgreen1: Then we need to make the 3d good enough and leave it at that
  99. (19:36:10) emeraldgreen1: >you might have a spoken sentence of 12 words, of which 8 were recognized with 99.7%+ confidence or something, from which you obtain 5 new diphones for your voice model
  100. (19:36:42) emeraldgreen1: It looks like a good idea, I thought about pirating lots of audiobooks and aligning them to corresponding texts
  101. (19:37:13) emeraldgreen1: training a model is very heavy computation though
  102. (20:00:52) kiradess: ?
  103. (20:03:10) emeraldgreen1: you have to do about 1000 ops per byte of your dataset, probably more
  104. (20:03:28) kiradess: you mean the RNN way?
  105. (20:03:33) emeraldgreen1: To train on a 100k hr dataset you will need a large cluster and a couple of weeks
  106. (20:03:52) kiradess: >100k hour dataset
  107. (20:03:57) kiradess: does such a thing even exist
  108. (20:04:11) emeraldgreen1: RNNs are very heavy, yes, but traditional HMMs aren't lightweight either
  109. (20:04:18) emeraldgreen1: kiradess in baidu - yes
  110. (20:04:29) kiradess: ah baidu
  111. (20:04:32) emeraldgreen1: also with dataset augmentation (add noise, transform etc)
  112. (20:04:50) kiradess: I never really gave it much thought beyond finding an open source toolkit that would do the job
  113. (20:05:36) kiradess: there was a link I posted before, festival i think it was
  114. (20:05:53) kiradess: where some of the voices were trained on very little data and sounded decent
  115. (20:05:58) emeraldgreen1: https://github.com/edobashira/speech-language-processing there is this large list
  116. (20:07:17) kiradess: http://www.cstr.ed.ac.uk/projects/festival/morevoices.html
  117. (20:07:17) kiradess: yeah I saw that, good stuff on there
  118. (20:07:34) kiradess: Scottish male - Alan (ARCTIC), Jon (2hr)
  119. (20:07:34) kiradess: English RP male - Nick (8hr), Roger (13hr), Korin (TIMIT, ~20mins)
  120. (20:07:34) kiradess: English RP female - Nina (3hr)
  121. (20:07:40) kiradess: 2hrs? 3hrs? what?
  122. (20:08:20) emeraldgreen1: tts doesn't require as much data as the inverse problem
  123. (20:09:18) kiradess: I guess not
  124. (20:09:29) kiradess: still, more data than I have at the present
  125. (20:09:57) emeraldgreen1: we need special data - a female voice in various real life situations
  126. (20:10:21) emeraldgreen1: also there is this new gnu speech project
  127. (20:13:32) kiradess: http://pages.cpsc.ucalgary.ca/~hill/helloComparison/helloComparison.wav lol
  128. (20:14:10) kiradess: its really all about the voice samples at this point
  129. (20:14:31) kiradess: unless you're going for some strong AI, indistinguishable-from-human approach
  130. (20:14:46) kiradess: for instance, the latest voiceroid is quite good
  131. (20:15:04) kiradess: as in, very natural sounding, even compared to the couple before it
  132. (20:16:18) kiradess: I actually own a physical copy of yukari's
  133. (20:16:27) kiradess: cost like 150usd to import the shit, after not being able to find a working torrent for probably 3 months after release
  134. (20:16:42) kiradess: ofc I don't use it anymore since I don't have a windows install
  135. (20:17:13) kiradess: http://www.ah-soft.com/voiceroid/kotonoha/index.html
  136. (20:18:27) kiradess: I don't know where I would go about getting decent english voice audio but my idea for jap was drama cds and other seiyuu stuff
  137. (20:23:13) kiradess: http://www.nicovideo.jp/watch/sm23462858
  138. (20:40:16) apt-get left the room (quit: Ping timeout: 240 seconds).
  139. (20:40:21) apt-get_ [~apt-get@CE6E29B4.3D888B11.CAF2BD85.IP] entered the room.
  140. (22:19:35) kiradess: http://festvox.org/bsv/x794.html
  141. (22:20:24) emeraldgreen1: kiradess makes me want to dict the dataset myself
  142. (22:20:37) kiradess: good luck
  143. (22:20:38) emeraldgreen1: I mean enter
  144. (22:21:05) emeraldgreen1: *sleepy* later
  145. (22:21:09) kiradess: this is why I like tools over consumer products
  146. (22:21:46) kiradess: i'll put everything in place so you can steal your favorite actor's voice, but i take no responsibility
  147. (22:21:50) kiradess: later
  148. (22:22:06) emeraldgreen1: good point
  149. (22:39:32) kiradess: I'm digging these 90's websites http://www.speech.cs.cmu.edu/comp.speech/
  150. (22:39:44) kiradess: >Last Revision: 18:40 05-Sep-1997
  151. (22:40:16) emeraldgreen1: vintage
  152. (22:40:46) emeraldgreen1: I like vintage dotcom-era forums
  153. (22:46:45) apt-get__ [~apt-get@DF4F89B7.939532AD.443D2AAB.IP] entered the room.
  154. (22:48:04) apt-get__ is now known as apt-get
  155. (22:48:25) apt-get_ left the room (quit: Ping timeout: 240 seconds).
  156. (22:49:01) apt-get: now that's an example of good design kiradess
  157. (22:49:03) apt-get: :DDDDD
  158. (22:56:47) kiradess: damn right
  159. (28.10.2015 00:22:52) apt-get_ [~apt-get@A2D44C5A.4EA120BF.A8BCC5AB.IP] entered the room.
  160. (00:26:05) apt-get left the room (quit: Ping timeout: 240 seconds).
  161. (01:10:00) apt-get_ is now known as apt-get
  162. (01:58:28) emeraldgreen1: http://arxiv.org/abs/1510.07211 ambitious
  163. (01:58:46) emeraldgreen1: but looks like this is going to happen
  164. (02:17:59) The account has disconnected and you are no longer in this chat. You will automatically rejoin the chat when the account reconnects.
  165. (02:34:31) The topic for #simwaifu is: We have a trello board - https://trello.com/b/VaUT0Vko/waifu-simulation-project available to members only, sign up and let me know your username if you want an invite. Private bitbucket at https://bitbucket.org/kiradess/simwaifu same deal, need your username to add you.
  166. (02:34:31) Topic for #simwaifu set by kiradess!~quassel@Rizon-85974635.dynamic.ip.windstream.net at 14:53:37 on 15.10.2015
  167. (03:48:31) The account has disconnected and you are no longer in this chat. You will automatically rejoin the chat when the account reconnects.
  168. (13:18:39) The topic for #simwaifu is: We have a trello board - https://trello.com/b/VaUT0Vko/waifu-simulation-project available to members only, sign up and let me know your username if you want an invite. Private bitbucket at https://bitbucket.org/kiradess/simwaifu same deal, need your username to add you.
  169. (13:18:39) Topic for #simwaifu set by kiradess!~quassel@Rizon-85974635.dynamic.ip.windstream.net at 14:53:37 on 15.10.2015
  170. (13:58:19) apt-get [~apt-get@ore.katana.bake.kizu.genji.monogatari] entered the room.
  171. (15:30:52) kiradess: >become a reality in future decades
  172. (15:30:52) kiradess: key word
  173. (15:32:07) emeraldgreen: it's only a question of datasets and compute
  174. (15:33:02) emeraldgreen: in the next 10 years we'll see working (if simple) program generation from informal natural language descriptions
  175. (15:34:00) kiradess: maybe
  176. (15:34:11) kiradess: it has the same problems chat bots do
  177. (15:34:38) kiradess: we can feed them tons of data, but like speech, most programs are actually completely unique and novel
  178. (15:35:04) kiradess: might be useful for boilerplate and skeletons of programs
  179. (15:36:17) kiradess: "like give me a generic java framework for this sort of problem" and you get a bunch of mostly empty classes in a common desing pattern
  180. (15:36:17) kiradess: design*
  181. (15:36:48) emeraldgreen: kiradess you have a valid concern but you underestimate the complexity of the internal representations learned by RNNs
  182. (15:37:24) emeraldgreen: chatbots are built with boring ngram models, they cannot learn nonlinear representations
  183. (15:37:35) kiradess: I just don't understand where the novelty is supposed to come from
  184. (15:37:41) emeraldgreen: they are basically simple tables
  185. (15:37:57) kiradess: it's like those NN programs they feed with classical music, then it makes new classical music
  186. (15:37:58) emeraldgreen: kiradess from generalization
  187. (15:38:32) kiradess: it doesn't start dropping dubstep beats because it learned all about music from that classical and is now inventing a new sound or some shit
  188. (15:38:51) emeraldgreen: you can test it - save 10% of your training dataset and check if your model generalizes to it
  189. (15:38:52) kiradess: therein lies the problem
  190. (15:39:33) kiradess: I don't understand this "generalization" then
  191. (15:40:03) emeraldgreen: the model is said to generalize if it works on data it has never seen
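[The hold-out test emeraldgreen suggests is simple to sketch; train() and evaluate() stand in for whatever model and metric are being tested.]

    # Sketch of the 10% hold-out check: train on 90% of the data, measure error
    # on the 10% the model has never seen. That unseen-data score is the
    # generalization being discussed.
    import random

    def holdout_check(dataset, train, evaluate, held_out=0.1, seed=0):
        data = list(dataset)
        random.Random(seed).shuffle(data)
        cut = int(len(data) * (1 - held_out))
        model = train(data[:cut])
        return evaluate(model, data[cut:])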
  192. (15:40:25) kiradess: that works for stuff like image and sound processing but it really can't be applied beyond a certain complexity
  193. (15:40:44) kiradess: where an image recognizer will only recognize other images
  194. (15:40:50) kiradess: and not sounds
  195. (15:41:14) kiradess: do you see what I mean? programming isn't a 1 dimensional input -> output problem
  196. (15:41:29) emeraldgreen: kiradess I'm not sure, look at http://karpathy.github.io/2015/05/21/rnn-effectiveness/ there is >Visualizing the predictions and the "neuron" firings in the RNN
  197. (15:41:29) emeraldgreen: section, it shows the sophistication of internal representation
  198. (15:41:42) emeraldgreen: >where an image recognizer will only recognize other images
  199. (15:41:42) emeraldgreen: you have a point
  200. (15:41:53) kiradess: i hate that link
  201. (15:42:02) kiradess: all that guy shows are toy applications
  202. (15:42:03) emeraldgreen: but you can train the system on multimodal dataset
  203. (15:42:08) emeraldgreen: hehe
  204. (15:42:14) kiradess: to where the NN does the same shit fed back into it
  205. (15:42:28) kiradess: like "see a lot of house numbers, write a new house number"
  206. (15:42:40) kiradess: that's what I mean about complexity and depth
  207. (15:43:16) emeraldgreen: you are right, but you underestimate the power learning algorithmic patterns
  208. (15:43:23) emeraldgreen: *power of
  209. (15:43:29) kiradess: instead of a generalization, it often looks like an averaging or a random mixing of the input
  210. (15:43:32) emeraldgreen: RNNs can learn whole algorithms
  211. (15:43:56) kiradess: see the source code section, the shakespeare section
  212. (15:44:15) kiradess: the RNNs focus on the wrong thing
  213. (15:44:26) kiradess: and not what we want, which is meaningful output
  214. (15:44:44) kiradess: and they never will, because all of the information is not being supplied to them
  215. (15:44:58) kiradess: that's the problem with any NN-based NLP
  216. (15:45:34) emeraldgreen: >instead of a generalization, it often looks like an averaging or a random mixing of the input
  217. (15:45:34) emeraldgreen: I wouldn't say they are a linear averaging. Anyway there is a way to train a generative model that doesn't do just linear mixing: http://soumith.ch/eyescream/
  218. (15:45:34) emeraldgreen: look at http://soumith.ch/eyescream/images/gen_church.png
  219. (15:45:48) kiradess: humans do not encode all of the meaning into an utterance, only the bare minimum that will enable the receiver to fill in the rest from their experience, knowledge, the context, situation, etc
  220. (15:46:32) emeraldgreen: I agree about current performance (though standard ngrams wouldn't be able to learn even these regularities you saw in this blog) but > and they never will,
  221. (15:46:32) emeraldgreen: is too strong a statement
  222. (15:46:54) kiradess: programs do not have a human's inner life, cannot fill in those blanks, and home in on random noise and meaningless symbols in the transmission
  223. (15:47:03) emeraldgreen: >all of the information is not being supplied to them
  224. (15:47:03) emeraldgreen: multi-modal datasets
  225. (15:47:29) kiradess: "never" is too strong a statement when talking about an unknown future, but not on a flawed approach
  226. (15:48:36) emeraldgreen: kiradess I don't see a proof that rnns are flawed (that they fundamentally can't learn some things that humans can learn)
  227. (15:48:40) kiradess: i'm not dismissing RNNs, but they just can't be applied to everything, especially in the situations like I'm talking about, where we are literally saying shit like "darmok and jalad at tanagra"
  228. (15:49:03) kiradess: because they're not learning like we are
  229. (15:49:15) emeraldgreen: yup, they aren't universal yet, especially if you consider the computational cost
  230. (15:49:38) kiradess: no, it's like some kind of chinese room, without the books inside
  231. (15:50:26) kiradess: i'll wait and see
  232. (15:50:37) emeraldgreen: I'll think about it. Time to go outside and bike a little
  233. (15:50:41) kiradess: but nothing really inspires me to experiment with them myself
  234. (15:52:51) kiradess: I'm not even knocking them or those who research/use them, they just seem only useful for specific applications
  235. (15:53:23) kiradess: or as an intermediate step in transforming some raw input to a parseable form
  236. (16:05:27) kiradess: I don't know why I have to argue about them when I actually don't feel strongly about them one way or another
  237. (17:46:38) kiradess: http://blog.ayoungprogrammer.com/2015/09/a-simple-artificial-intelligence.html
  238. (18:05:13) kiradess: can't get it working tho
  239. (20:22:55) kiradess: https://www.youtube.com/watch?v=3u4x1ZIHgrA
  240. (22:57:07) The account has disconnected and you are no longer in this chat. You will automatically rejoin the chat when the account reconnects.
  241. (22:58:49) The topic for #simwaifu is: We have a trello board - https://trello.com/b/VaUT0Vko/waifu-simulation-project available to members only, sign up and let me know your username if you want an invite. Private bitbucket at https://bitbucket.org/kiradess/simwaifu same deal, need your username to add you.
  242. (22:58:49) Topic for #simwaifu set by kiradess!~quassel@Rizon-85974635.dynamic.ip.windstream.net at 14:53:37 on 15.10.2015
  243. (23:07:46) emeraldgreen: I'm back. Nice video! I like actuated BJDs
  244. (23:10:30) kiradess: hope i didnt piss you off earlier
  245. (23:10:40) kiradess: i know you're into RNN stuff
  246. (23:12:52) kiradess: I honestly don't know enough about it to make a judgement either way
  247. (23:13:19) emeraldgreen: no, no
  248. (23:13:43) emeraldgreen: I'm slow at arguing because I'm slow at formulating my responses in english
  249. (23:13:50) emeraldgreen: reading is easier than writing
  250. (23:14:03) emeraldgreen: RNN isn't a panacea
  251. (23:16:11) kiradess: meanwhile I'm surveying speech synthesis software instead of fixing this 3d model
  252. (23:16:22) emeraldgreen: It's a good use of time as well
  253. (23:16:44) kiradess: the outlook is grim though, it all sounds so bad
  254. (23:17:13) emeraldgreen: I don't think it's that bad
  255. (23:17:25) emeraldgreen: Aigis sounded robotic in persona 3 and it was cool
  256. (23:17:38) kiradess: lol?
  257. (23:18:15) kiradess: she was voiced by sakamoto maaya on the jap version
  258. (23:18:33) emeraldgreen: I know, but she tried to mimic the robot voice
  259. (23:18:35) kiradess: was her normal voice iirc
  260. (23:18:40) emeraldgreen: hmm
  261. (23:18:43) emeraldgreen: maybe
  262. (23:18:45) kiradess: oh, now that you mention it
  263. (23:18:59) kiradess: very monotonic, expressionless
  264. (23:19:04) emeraldgreen: gnu speech reminds me of aigis a bit
  265. (23:19:30) kiradess: just playing the character role, vs actually putting her voice through a filter or something
  266. (23:20:15) emeraldgreen: https://youtu.be/gHMyW7Fr0y8?t=15 hehe
  267. (23:21:00) emeraldgreen: >tfw no tartarus to explore
  268. (23:21:33) kiradess: barf
  269. (23:21:54) kiradess: did the international versions have jap voice options? i played the original
  270. (23:22:22) emeraldgreen: idk, I played in english
  271. (23:22:32) emeraldgreen: more free language practice
  272. (23:23:46) kiradess: do you have a demo for gnuspeech? I can't find one atm
  273. (23:24:19) emeraldgreen: http://pages.cpsc.ucalgary.ca/~hill/helloComparison/helloComparison.wav
  274. (23:26:43) kiradess: oh that
  275. (23:28:20) emeraldgreen: Also there is this http://tts.speech.cs.cmu.edu:8083/
  276. (23:28:23) emeraldgreen: not so bad
  277. (23:29:56) emeraldgreen: about the RNN universality I still don't completely understand your point
  278. (23:30:26) emeraldgreen: >programs do not have a human's inner life, cannot fill in those blanks, and home in on random noise and meaningless symbols in the transmission
  279. (23:30:26) emeraldgreen: >because they're not learning like we are
  280. (23:32:28) kiradess: by "inner life" I may be misusing the expression but essentially it means the whole of your existence, your memories, your knowledge base, your goals long and short term, etc
  281. (23:32:59) kiradess: all human communication is just triggering a response based on those factors
  282. (23:33:44) kiradess: we can communicate at all because so much of our inner lives are common to us all, at least within ethnic groups, regions, language users, etc
  283. (23:34:33) kiradess: so a RNN or anything else that was only exposed to our communications, be it words, sounds, even images, would never draw the "correct" conclusions from it
  284. (23:34:48) kiradess: because it isn't there, anywhere, at all
  285. (23:34:54) emeraldgreen: We share a common internal model because we have grown/learned in mostly the same environment
  286. (23:35:08) kiradess: basically
  287. (23:35:16) emeraldgreen: Ah, I understand, your argument is about symbol grounding
  288. (23:35:25) kiradess: not entirely
  289. (23:35:46) kiradess: that's one aspect, maybe, assuming we use platonic symbols for things
  290. (23:36:26) emeraldgreen: Well, your point is that we cannot model human behavior by training our model on limited datasets that lack information available to humans
  291. (23:36:32) emeraldgreen: ?
  292. (23:36:38) kiradess: pretty much
  293. (23:36:56) kiradess: speech recognition is the same way
  294. (23:37:13) kiradess: we achieve very good results with some algorithms, say 97%+
  295. (23:37:33) kiradess: but in reality, humans only "hear" something like 80% of what is said to them
  296. (23:37:39) emeraldgreen: But if we have a very powerful model (say, a global search over all possible algorithms) and the right dataset, could we learn an agent that behaves similarly to a human?
  297. (23:38:05) kiradess: the rest is inferred, the gaps filled in, just plain made up, etc
  298. (23:38:23) kiradess: I wouldn't say so
  299. (23:38:35) emeraldgreen: I think it's because humans have much more context (priors in ML parlance)
  300. (23:38:53) kiradess: I'm in the camp that believes an AI would have to, at the very least, live a more or less human life, in order to think and act like one
  301. (23:39:39) emeraldgreen: kiradess so, hypothetically, in the space of all possible programs there aren't any that can model the behavior of some human (with finite precision of course) ?
  302. (23:40:22) kiradess: that's not a very useful hypothetical
  303. (23:40:42) kiradess: because yes, if you're talking about the infinite, then surely it does
  304. (23:40:56) emeraldgreen: I agree that we need to give our model human-like embodied experience to have chance at training human-like AI
  305. (23:41:41) kiradess: that's like when people say "the universe is (practically) infinite, so therefore, there is another planet out there exactly like this one, with someone exactly like you on it" it's retarded
  306. (23:41:48) emeraldgreen: I agree
  307. (23:41:55) emeraldgreen: but there is no need in infinity here
  308. (23:42:08) emeraldgreen: Human brain is a small finite physical object
  309. (23:42:21) kiradess: there is when you're trying to code in a human life from scratch
  310. (23:42:48) emeraldgreen: Isn't it true that the program that simulates the brain's behavior can/should be finite as well?
  311. (23:42:54) kiradess: it's gonna take an infinite amount of random arranging to get it right
  312. (23:43:06) kiradess: obviously
  313. (23:43:32) kiradess: so it's a matter of picking how it should be structured, how it should operate, etc
  314. (23:43:54) kiradess: but it's a combinatorics problem
  315. (23:44:21) emeraldgreen: kiradess My argument here is that its at least theoretically possible to take a recording of human experience and fit a finite algorithmic model to it. The obvious approach is genetic algorithm over any programming language. The model selection criterion is minimal description length.
  316. (23:44:48) kiradess: sounds like a tautology
  317. (23:44:52) emeraldgreen: Also it is doable in finite (if very large) time
  318. (23:45:19) kiradess: is it?
  319. (23:45:45) kiradess: for instance, if you wired a baby from birth to intercept all incoming nerve signals
  320. (23:45:46) emeraldgreen: ah, maybe it's not the full argument, I mean "... and the resulting model will be functionally similar to the human whose experience log we used to train it)"
  321. (23:45:52) emeraldgreen: yup
  322. (23:46:35) kiradess: theoretically, it should be
  323. (23:46:53) kiradess: the time and space complexity problem remains
  324. (23:47:07) emeraldgreen: Then we don't have any disagreements
  325. (23:47:18) kiradess: you claim a model could be fitted in a reasonably finite amount of time, but that remains to be seen
  326. (23:47:38) emeraldgreen: A sufficiently large RNN can model any given program, the proof is very simple.
  327. (23:48:33) emeraldgreen: kiradess Nope, I think it's an unreasonably large amount of time, if we use plain genetic programming (which is general, but slow).
  328. (23:48:36) kiradess: i assume you mean turing completeness?
  329. (23:49:32) emeraldgreen: kiradess Almost - RNNs don't have infinite memory (just like our desktop computers and brains though)
  330. (23:49:48) emeraldgreen: but in a more down-to-earth sense, yes
  331. (23:50:37) kiradess: that also assumes that RNNs could model what the brain does, in finite time and space
  332. (23:51:22) emeraldgreen: kiradess yup, it's built on the assumption that the human brain can be modeled by classic algorithm in finite time
  333. (23:51:28) emeraldgreen: *based
  334. (23:51:52) emeraldgreen: Don't think I'm another AI crank, lel
  335. (23:52:10) emeraldgreen: I don't say it's possible here and now, just in theory
  336. (23:52:14) kiradess: but because we don't know how the brain works, the best RNN may be no more effective than randomly guessing at configurations and testing them
  337. (23:52:27) emeraldgreen: kiradess Yup, there is a prior problem
  338. (23:53:21) kiradess: forgive me if the topic doesn't really excite me
  339. (23:53:49) emeraldgreen: Human brain has lots(?) of built-in evolutionary biases that are absent (or replaced by our own simple architectural biases) in RNNs and other models.
  340. (23:53:58) emeraldgreen: Ok, I won't bother you then
  341. (23:54:46) emeraldgreen: I just feel that engineering behaviors by hand is very tedious
  342. (23:56:03) emeraldgreen: maybe there will be a compromise though
  343. (23:57:02) kiradess: it is tedious
  344. (23:57:21) kiradess: hell, maybe humans don't even learn like we think we do
  345. (23:57:30) kiradess: and we've all been trained by hand
  346. (23:57:37) emeraldgreen: kiradess maybe lel
  347. (23:57:55) emeraldgreen: neuroscience knows only so much
  348. (23:57:57) kiradess: you've heard of feral children, or children locked in rooms for years with no human contact?
  349. (23:58:12) kiradess: and how they grow up mentally retarded as a result
  350. (23:58:24) emeraldgreen: kiradess yup, I heard that their brains were underdeveloped
  351. (23:58:57) kiradess: maybe we arent as capable of unsupervised learning as we thought, and that part of raising a child is to configure their NNs
  352. (23:59:13) kiradess: in such a way that they can learn and function independently thereafter
  353. (23:59:52) emeraldgreen: kiradess there are known critical periods for learning various skills
  354. (29.10.2015 00:00:03) kiradess: yep
  355. (00:00:23) kiradess: that might be a neural plasticity issue too, idk
  356. (00:00:29) emeraldgreen: but there are also experiments that show the universality of neocortical learning
  357. (00:01:01) emeraldgreen: well known experiment with ferrets http://home.fau.edu/lewkowic/web/SUR.PDF
  358. (00:03:05) kiradess: hawkins says the same, that the neocortex is largely homogenous
  359. (00:03:14) emeraldgreen: yup, I like his theory
  360. (00:03:31) emeraldgreen: general AI is a hard problem and it's not our problem anyway
  361. (00:03:45) kiradess: yeah
  362. (00:04:10) kiradess: I haven't given much thought to a right-now solution though thb
  363. (00:04:12) kiradess: tbh*
  364. (00:04:26) emeraldgreen: solution to which exact problem?
  365. (00:04:57) kiradess: a rudimentary waifu AI
  366. (00:05:06) kiradess: in terms of NLP stuff
  367. (00:05:21) kiradess: this was interesting, if what he claims to be able to do is true http://blog.ayoungprogrammer.com/2015/09/a-simple-artificial-intelligence.html
  368. (00:05:38) emeraldgreen: yes it is
  369. (00:05:40) kiradess: I can't get the dependencies working right though, so I haven't tried it
  370. (00:05:50) emeraldgreen: I haven't as well, never had the time
  371. (00:06:40) kiradess: well it's interesting because it uses mostly off-the-shelf software components
  372. (00:07:11) emeraldgreen: http://smerity.com/articles/2015/keras_qa.html this one is (or isn't) a similar QA system
  373. (00:07:18) kiradess: though this guy http://spacy.io/blog/dead-code-should-be-buried/ claims the stanford parser used in it is a piece of shit
  374. (00:08:57) apt-get_ [~apt-get@Rizon-4931578C.adsl196-12.iam.net.ma] entered the room.
  375. (00:09:18) kiradess: what i'd like to know is, can RNNs be used in any realtime capacity?
  376. (00:09:39) kiradess: can they both simultaneously learn and produce output?
  377. (00:10:01) kiradess: how fast can a new input affect it, and thus, the output?
  378. (00:10:13) kiradess: because with humans, you're talking milliseconds
  379. (00:10:51) kiradess: everything I read about RNNs tends to talk in minutes to days, of full-out 8 core computing
  380. (00:11:49) emeraldgreen: kiradess Nope, training is >1000x harder than running the model. But you can train your model to use the short term memory.
  381. (00:12:14) kiradess: i don't follow
  382. (00:12:16) apt-get left the room (quit: Ping timeout: 240 seconds).
  383. (00:12:26) emeraldgreen: hah! 8-core computing is low-tier, you need a Nvidia Titan X here.
  384. (00:12:34) emeraldgreen: (or some other decent gpu)
  385. (00:12:38) kiradess: I was under the impression that they were all "train then use"
  386. (00:13:49) kiradess: that after the training phase, new input is simply processed and spit out as output, without affecting the model in any way
  387. (00:13:50) emeraldgreen: kiradess I mean training an RNN model is done offline and takes days. The model itself is fast though, 100ms per timestep on the CPU
  388. (00:14:14) emeraldgreen: kiradess RNNs have a state vector, it is a short term memory
  389. (00:14:59) emeraldgreen: in the previous link RNNs are trained to answer questions using evidence from input that came in tens of timesteps ago
  390. (00:15:27) emeraldgreen: real long term memory is still an ongoing research problem though
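[A toy illustration of the "state vector as short-term memory" point above: a plain Elman-style RNN step in which the hidden state carries evidence from earlier timesteps forward. Sizes and weights here are arbitrary, not a trained model.]

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hidden = 8, 16
    W_xh = rng.standard_normal((n_hidden, n_in)) * 0.1      # input-to-hidden weights
    W_hh = rng.standard_normal((n_hidden, n_hidden)) * 0.1  # hidden-to-hidden (the "memory" path)
    b_h = np.zeros(n_hidden)

    def rnn_step(x, h):
        # the new hidden state depends on the current input AND the previous state
        return np.tanh(W_xh @ x + W_hh @ h + b_h)

    h = np.zeros(n_hidden)          # short-term memory starts empty
    for t in range(5):              # feed a few timesteps of dummy input
        x = rng.standard_normal(n_in)
        h = rnn_step(x, h)          # information from earlier steps persists in h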
  391. (00:16:10) emeraldgreen: btw, how do you even train your reasoning model online, how do you know that its action was right?
  392. (00:16:39) emeraldgreen: (well, if not reasoning, then conversation model)
  393. (00:16:59) kiradess: idk, but it's gotta be done
  394. (00:17:07) emeraldgreen: I know that predictive models (they just predict next input based on current input) are trained online
  395. (00:17:27) kiradess: you can't train a model offline then hope it keeps up in a continued online state
  396. (00:17:35) kiradess: it would never learn anything new
  397. (00:17:42) emeraldgreen: kiradess yup
  398. (00:18:04) emeraldgreen: episodic memory is an unsolved problem
  399. (00:18:13) kiradess: i'd suggest a kind of very reserved personality model
  400. (00:18:25) kiradess: that just wants to listen and absorb information
  401. (00:18:30) emeraldgreen: well to be fair DQN model really has longterm memory, but it's a general AI and it's not for us
  402. (00:18:38) emeraldgreen: kiradess yup
  403. (00:18:44) emeraldgreen: maybe even a hardcoded one
  404. (00:18:50) emeraldgreen: just a user profile
  405. (00:18:55) kiradess: and ask questions about everything
  406. (00:20:22) kiradess: i'd rather have an AI that says "I'm not sure what you mean by X" after every sentence I say, than one that returns markov-chain-esque gibberish
  407. (00:20:45) emeraldgreen: hehe, I agree, I think that's what the user wants as well
  408. (00:21:08) kiradess: as long as it could use whatever grammar and language rules inherent to the language, plus some kind of data modeling/question answering like I linked
  409. (00:21:20) kiradess: I think it would sound ok
  410. (00:21:23) emeraldgreen: but for my own use I'd like an AI that would sometimes amuse me with unexpected generalizations
  411. (00:21:41) emeraldgreen: >as long as it could use whatever grammar and language rules inherent to the language, plus some kind of data modeling/question answering like I linked
  412. (00:21:41) emeraldgreen: hello Cyc!
  413. (00:22:25) kiradess: one of those
  414. (00:22:34) kiradess: don't those guys always kill themselves?
  415. (00:22:42) emeraldgreen: what I'm afraid of is sinking a year or two into handcrafting a brittle AI that will never work
  416. (00:22:50) emeraldgreen: kiradess I mean Cyc Corp
  417. (00:23:08) kiradess: yeah this https://en.wikipedia.org/wiki/Cyc
  418. (00:23:26) kiradess: I was referring to a couple of other AI researchers who tried similar projects
  419. (00:23:34) kiradess: forget their names now
  420. (00:23:49) emeraldgreen: ah
  421. (00:24:00) emeraldgreen: you see, symbolic AI died for a reason
  422. (00:24:47) emeraldgreen: we have to constrain our development so we won't repeat their errors (trying to do too much with a handcoded symbolic ai)
  423. (00:25:02) kiradess: https://en.wikipedia.org/wiki/Chris_McKinstry this guy and Push Singh
  424. (00:26:21) kiradess: they both had projects that tried to crowdsource knowledge by having people submit facts to them
  425. (00:26:45) emeraldgreen: yup
  426. (00:27:19) emeraldgreen: the only hope for symbolic AI to appear to be working is to radically constrain its interaction domain and use artistic skills to mask its weaknesses
  427. (00:27:26) emeraldgreen: just like it's done in computer games
  428. (00:29:44) kiradess: hm, i'm too tired to come up with any brilliant schemes right now but
  429. (00:29:44) kiradess: i'd like to try a kind of hybrid approach
  430. (00:29:44) kiradess: take audio in, pass it through a recognizer to get the transcription
  431. (00:29:44) kiradess: pass the transcription to a part of speech tagger
  432. (00:29:53) emeraldgreen: ok
  433. (00:29:56) emeraldgreen: I'm afk
  434. (00:30:03) kiradess: ok
  435. (00:31:16) kiradess: use the now tagged words to build a sentence map, like in http://blog.ayoungprogrammer.com/2015/09/a-simple-artificial-intelligence.html
  436. (00:31:42) kiradess: from there, extract all the information possible
  437. (00:32:21) kiradess: try to distinguish between absolute and temporal facts
  438. (00:32:43) kiradess: maybe, idk about that one
  439. (00:32:57) kiradess: but I think it's important
  440. (00:33:20) kiradess: for instance, if I said "oh btw, the earth is round", that will still be true tomorrow
  441. (00:33:46) kiradess: but if I said "oh btw, dinner's ready", that most likely won't be true at some random point in the future
  442. (00:35:46) kiradess: then categorize the received input ala this paper https://web.stanford.edu/~jurafsky/ws97/CL-dialog.pdf
  443. (00:36:01) kiradess: to understand if a fact is being stated, a question is being asked, etc
  444. (00:36:28) kiradess: this is where most chat bots fall flat, they always respond, regardless of whether a response is warranted or not
  445. (00:36:57) kiradess: then take all the information on hand to formulate a response
  446. (00:37:28) kiradess: knowledge on hand, current context, what was just said, etc
  447. (00:38:02) kiradess: model short term memory like a cache of past events, weighted by their age
  448. (00:39:43) kiradess: so you can avoid saying the same sentence twice, or respond in some way to receiving the same sentence as input in quick succession
  449. (00:39:43) kiradess: like, "I just answered that question. Aren't you listening to me?"
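[A sketch of the control flow kiradess lays out above; transcribe(), pos_tag(), classify_act() and answer_from_facts() are placeholders (NLTK is the kind of library that could supply the tagging), and the half-life weighting of the memory is an assumption.]

    import time

    class ShortTermMemory:
        def __init__(self, half_life=60.0):
            self.events = []                # list of (timestamp, text)
            self.half_life = half_life

        def weight(self, ts, now):
            # older events count for exponentially less
            return 0.5 ** ((now - ts) / self.half_life)

        def seen_recently(self, text, now, threshold=0.5):
            return any(t == text and self.weight(ts, now) > threshold
                       for ts, t in self.events)

        def remember(self, text, now):
            self.events.append((now, text))

    def handle_utterance(audio, transcribe, pos_tag, classify_act,
                         answer_from_facts, memory):
        now = time.time()
        text = transcribe(audio)            # ASR: audio -> transcription
        tags = pos_tag(text)                # e.g. NLTK-style (word, POS) pairs
        act = classify_act(text, tags)      # "statement", "question", "backchannel", ...
        if act == "statement":
            memory.remember(text, now)      # absorb the fact, no reply required
            return None
        if act == "question":
            if memory.seen_recently(text, now):
                return "I just answered that. Aren't you listening to me?"
            memory.remember(text, now)
            return answer_from_facts(text, tags)
        return None                         # most other acts don't warrant a response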
  450. (00:40:23) kiradess: it would take a whole bunch of tuning and experimenting to see which approach works best for which aspect
  451. (00:40:27) apt-get_ left the room (quit: Read error: Connection reset by peer).
  452. (00:40:37) kiradess: different statistical approaches, etc
  453. (00:40:59) apt-get_ [~apt-get@7E2E9DB4.557EA84B.257419B5.IP] entered the room.
  454. (00:41:13) kiradess: you could statistically model anything really
  455. (00:41:50) kiradess: take this paper I just linked, you could build some kind of model out of the order in which dialogue acts occur
  456. (00:43:19) kiradess: and infer that questions should be followed by answers, disagreement followed by a clarifying question, etc
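[That dialogue-act ordering could be captured with something as plain as bigram counts over labeled conversations; the act labels and toy data below are assumptions.]

    from collections import Counter, defaultdict

    def train_act_bigrams(conversations):
        # counts[prev_act][next_act] = how often next_act followed prev_act
        counts = defaultdict(Counter)
        for acts in conversations:          # each conversation is a list of act labels
            for prev, nxt in zip(acts, acts[1:]):
                counts[prev][nxt] += 1
        return counts

    def most_likely_response_act(counts, prev_act):
        following = counts.get(prev_act)
        return following.most_common(1)[0][0] if following else None

    model = train_act_bigrams([
        ["question", "answer", "statement"],
        ["question", "answer"],
        ["disagreement", "clarifying-question"],
    ])
    print(most_likely_response_act(model, "question"))        # -> answer
    print(most_likely_response_act(model, "disagreement"))    # -> clarifying-question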
  457. (00:44:38) kiradess: you could say a RNN could model that, but it wouldnt adapt to the individual or give rise to unique relations between user and waifu
  458. (00:45:42) kiradess: for instance, a couple who are argumentative, a couple that doesn't rely on words much, where the user does most of the talking, or vice-versa, etc
  459. (00:46:15) kiradess: hit some sort of balance between imitating and complementing the user
  460. (00:47:26) kiradess: or, you could use RNNs, and use a mandatory "sleep" time for training
  461. (00:47:27) kiradess: which some say humans do a form of, moving short-term memories to long-term and who knows what else when we sleep
  462. (00:47:44) kiradess: but whatever, im rambling, I gotta sleep
  463. (00:50:22) kiradess: speaking of context, when I think of context, I imagine a cache of all sensory input and output (what is heard, what is said, what is seen, what is at hand, what is being done, etc) for N number of snapshots extending into the past
  464. (00:50:22) kiradess: however many can be reasonably searched and worked upon
  465. (00:50:59) kiradess: with every element weighted based on its timestamp's age and/or difference from its current counterpart
  466. (00:51:26) kiradess: let's say you've got 10 snapshots in "memory", and we're looking at only the "location" field
  467. (00:52:10) kiradess: the last 5 are "house", the 5 before that are "the local park"
  468. (00:52:37) kiradess: if the current location is "house", that should affect the weight of the entire snapshot
  469. (00:52:57) kiradess: or at least the fields that have a location component of any kind
  470. (00:53:18) kiradess: are dependent on location in some way
  471. (00:53:43) kiradess: but yeah, idk
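[A sketch of the snapshot cache kiradess describes, with each snapshot weighted by its age and by how much its fields agree with the current one; the field names, half-life, and similarity measure are all assumptions.]

    from collections import deque

    class ContextCache:
        def __init__(self, size=10, half_life=300.0):
            self.snapshots = deque(maxlen=size)     # each entry: (timestamp, dict of fields)
            self.half_life = half_life

        def add(self, timestamp, fields):
            self.snapshots.append((timestamp, fields))

        def weighted(self, now, current_fields):
            out = []
            for ts, fields in self.snapshots:
                age_w = 0.5 ** ((now - ts) / self.half_life)      # older -> lower weight
                same = sum(1 for k, v in fields.items()
                           if current_fields.get(k) == v)
                sim_w = same / max(len(fields), 1)                # agreement with "now" -> higher weight
                out.append((age_w * sim_w, fields))
            return sorted(out, key=lambda pair: pair[0], reverse=True)

    # e.g. ten snapshots, looking only at the "location" field: the five taken at
    # "house" outweigh the five at "the local park" once the current location is "house".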
  472. (03:24:46) apt-get_ left the room (quit: Remote host closed the connection).
  473. (03:54:49) The account has disconnected and you are no longer in this chat. You will automatically rejoin the chat when the account reconnects.
  474. (03:55:17) The topic for #simwaifu is: We have a trello board - https://trello.com/b/VaUT0Vko/waifu-simulation-project available to members only, sign up and let me know your username if you want an invite. Private bitbucket at https://bitbucket.org/kiradess/simwaifu same deal, need your username to add you.
  475. (03:55:17) Topic for #simwaifu set by kiradess!~quassel@Rizon-85974635.dynamic.ip.windstream.net at 14:53:37 on 15.10.2015
  476. (03:57:51) emeraldgreen: we should write it down, these are good ideas
  477. (03:58:06) emeraldgreen: >from there, extract all the information possible
  478. (03:58:06) emeraldgreen: AI-complete problem
  479. (03:58:18) emeraldgreen: >try to distinguish between absolute and temporal facts
  480. (03:58:18) emeraldgreen: doable in a limited domain
  481. (03:58:39) emeraldgreen: >to understand if a fact is being stated, a question is being asked, etc
  482. (03:58:39) emeraldgreen: need a list of categories of phrases
  483. (03:58:56) emeraldgreen: >then take all the information on hand to foirmulate a response
  484. (03:58:56) emeraldgreen: AI-complete problem
  485. (04:00:24) emeraldgreen: >knowledge on hand, current context, what was just said, etc
  486. (04:00:24) emeraldgreen: This is a good idea, we started to design the behavior system around the context too. But it requires formalization of all features that can be present in the context.
  487. (04:01:13) emeraldgreen: >it would take a whole bunch of tuning and experimenting to see which approach works best for which aspect
  488. (04:01:13) emeraldgreen: It's impossible to do by hand, we should optimize params with some optimization algorithm. Optimizing by hand is a road to nowhere, really.
  489. (04:03:58) emeraldgreen: >you could statistically model anything really
  490. (04:03:58) emeraldgreen: I agree but in reality pure statistical models are either too shallow (naive bayes), intractable (full bayesian inference), or they just require a large amount of domain expertise (to design a small graph of conditional dependencies between latent variables) and huge computational requirements (because of sampling).
  491. (04:03:58) emeraldgreen: Probabilistic programming is an interesting approach though.
  492. (04:05:00) emeraldgreen: >you could say a RNN could model that, but it wouldnt adapt to the individual or give rise to unique relations between user and waifu
  493. (04:05:00) emeraldgreen: In practice ANNs are able to model much more complex distributions than pure probabilistic approaches (that's why we don't see prob. based approaches winning competitions)
  494. (04:05:52) emeraldgreen: >or, you could use RNNs, and use a mandatory "sleep" time for training
  495. (04:05:52) emeraldgreen: Probabilistic models with latent variables require lots and lots of sampling (training) to tune their parameters
  496. (04:07:06) emeraldgreen: >speaking of context
  497. (04:07:06) emeraldgreen: I see two models for context: 1) handcrafted list of features 2) distributed representation like word2vec
  498. (04:07:06) emeraldgreen: >but whatever, im rambling, I gotta sleep
  499. (04:07:06) emeraldgreen: See ya later!
  500. (04:07:59) emeraldgreen: >with every element weighted based on it's timestamp's age/and or difference from it's current counterpart
  501. (04:07:59) emeraldgreen: And here you have introduced 2*N weights (age, diff) into your model, which will require extensive training to find their optimal values
  502. (04:08:22) emeraldgreen: See ya tomorrow in this chat!