- (17:42:54) The topic for #simwaifu is: We have a trello board - https://trello.com/b/VaUT0Vko/waifu-simulation-project available to members only, sign up and let me know your username if you want an invite. Private bitbucket at https://bitbucket.org/kiradess/simwaifu same deal, need your username to add you.
- (17:42:54) Topic for #simwaifu set by kiradess!~quassel@Rizon-85974635.dynamic.ip.windstream.net at 14:53:37 on 15.10.2015
- (18:24:40) apt-get_ [~apt-get@D97330BC.55ADF584.217FDA43.IP] entered the room.
- (18:55:20) kiradess: hating lua right now
- (18:57:21) kiradess: just all around useless as a general scripting language
- (19:08:33) emeraldgreen: it's less bloated than angelscript
- (19:08:41) emeraldgreen: and fast
- (19:15:44) kiradess: not saying it's bad for game engine stuff
- (19:16:00) kiradess: but fuck if it doesn't have 2% the functionality of python
- (19:16:10) kiradess: feels like i'm writing C
- (19:16:29) emeraldgreen: it's also used for cutting edge deep learning (torch), outcompeting python in that space
- (19:16:39) kiradess: I mean for general stuff, I was fucking around with rewriting a bunch of scripts in it for practice
- (19:17:12) kiradess: and it lacked some really basic math and string stuff, it was weird
- (19:17:52) emeraldgreen: hmm, maybe there are libs for that
- (19:18:14) emeraldgreen: I stopped caring about languages because I write JS, lua and python sometimes
- (19:21:44) kiradess: same, lua wouldnt have even been on my radar if it wasn't for urho3d
- (19:23:01) emeraldgreen: I wrote a little lua when I modded S.T.A.L.K.E.R., also I read its sources when I tried to create my own small dynamic language
- (19:24:11) kiradess: cool
- (19:24:54) emeraldgreen: it'd be cool if Urho had built-in node.js
- (19:25:13) emeraldgreen: +performance +popularity
- (19:30:57) kiradess: dont think node.js has 97% the speed of C
- (19:31:25) emeraldgreen: it comes close, within a factor of 2-3
- (19:31:44) emeraldgreen: V8 is the fastest dynamic language compiler in the world
- (19:32:20) emeraldgreen: luajit used to hold that title but now it's 2x slower than V8
- (19:32:55) kiradess: cool
- (21:10:26) apt-get_: create a new JIT compiler that has 95% the speed of C in python
- (21:10:26) apt-get_: revolutionize the programming world
- (21:10:26) apt-get_: for waifus
- (21:53:46) kiradess: nah
- (22:08:28) emeraldgreen: apt-get_ there are 3 (!) different JIT compilers for python
- (22:16:58) apt-get_: are there emeraldgreen
- (22:17:01) apt-get_ is now known as apt-get
- (22:17:05) apt-get: huh
- (22:17:22) apt-get: I know of pypy but I thought it was the only one
- (22:18:00) emeraldgreen: pypy, pyston and numba
- (27.10.2015 00:55:40) apt-get left the room (quit: Remote host closed the connection).
- (01:32:07) apt-get [~apt-get@ore.katana.bake.kizu.genji.monogatari] entered the room.
- (03:21:25) The account has disconnected and you are no longer in this chat. You will automatically rejoin the chat when the account reconnects.
- (12:14:35) kiradess: I was thinking about speech synthesis recently
- (12:14:45) kiradess: most software for creating a voice model relies on a speaker recording themselves saying a series of set phrases
- (12:15:12) kiradess: some few hundred/few thousand words in total, covering all the common diphones/triphones
- (12:27:25) kiradess: anyway
- (12:27:56) kiradess: that really only works if you've got a voice people can stand to listen to, and a lot of time on your hands
- (12:28:44) kiradess: I was wondering if anyone has attempted to train a voice model on unlabeled data, by means of speech recognition
- (12:30:23) kiradess: instead of say.. hundreds or thousands of carefully labeled and sliced words, using something like hundreds of HOURS of someone's podcasts or other source of speech without background noise
- (12:30:48) kiradess: take the slowest, most accurate speech parser, run it over some audio, keep only the sections it recognized with high confidence
- (12:31:35) kiradess: repeat until you have enough samples for one of the voice modelers out there like festvox
- (12:38:37) kiradess: I know it's done for recognition
- (12:39:11) kiradess: that is, train a recognition model on a large amount of unlabeled audio, somehow
- (12:39:49) kiradess: I just don't know if anyone has taken the identified, and now labeled, audio segments from that process and fed them into a voice modeler
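The harvesting loop described above can be sketched roughly like this — a toy Python sketch in which the recognizer is stubbed out as (word, confidence) pairs; a real pipeline would use actual ASR output and phoneme-level diphones, not the letter-pair stand-in used here:

```python
from typing import List, Set, Tuple

def extract_diphones(word: str) -> List[str]:
    """Letter-pair stand-in for real phoneme diphones (an assumption:
    real diphones come from a phonemic transcription, not spelling)."""
    return [word[i:i + 2] for i in range(len(word) - 1)]

def harvest_segments(recognized: List[Tuple[str, float]],
                     threshold: float = 0.997) -> Tuple[List[str], Set[str]]:
    """Keep only words the recognizer reported above the confidence
    threshold, and collect the diphones they contribute."""
    kept = [word for word, conf in recognized if conf >= threshold]
    diphones: Set[str] = set()
    for word in kept:
        diphones.update(extract_diphones(word))
    return kept, diphones

# A 12-word utterance of which 8 words clear the bar, as in the
# hypothetical above; low-confidence words are simply discarded.
utterance = [("hello", 0.999), ("there", 0.998), ("um", 0.41),
             ("this", 0.999), ("is", 0.999), ("a", 0.62),
             ("podcast", 0.998), ("about", 0.999), ("uh", 0.30),
             ("speech", 0.999), ("synthesis", 0.55), ("today", 0.998)]
kept, diphones = harvest_segments(utterance)
print(len(kept), "words kept,", len(diphones), "candidate diphones")
```

The outer loop from the chat — keep running over more audio until the diphone inventory is large enough for a voice modeler like festvox — would just call `harvest_segments` repeatedly and union the results.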
- (12:49:26) emeraldgreen left the room (quit: Ping timeout: 240 seconds).
- (12:51:12) kiradess: either I'm not searching with the right terms or it's a shit idea and no one has tried it
- (13:27:04) kiradess: not thrilled to be hung up on 3d stuff when i'd really like to dive into speech and NLP stuff
- (13:27:32) kiradess: python has some great libraries for that i've been meaning to use, like NLTK
- (13:47:45) kiradess: http://lands.let.ru.nl/cgn/publs/2002_06.pdf this
- (13:48:40) kiradess: >Spoken Dutch Corpus project
- (13:48:41) kiradess: kek
- (13:51:14) kiradess: >For the speech technologist segmentations are indispensable for the initial training of acoustic ASR models, the development of TTS systems and speech research in general.
- (13:55:14) kiradess: I think it's a good idea, especially since error rate is not that important
- (13:56:05) kiradess: a lot of what I've been reading is people using automatic speech recognition to attempt to segment and label large bodies of raw audio
- (13:56:21) kiradess: but their focus has been on reducing the error rate as much as possible
- (13:56:44) kiradess: in my use, any ambiguous utterances can just be discarded
- (13:57:54) kiradess: you might have a spoken sentence of 12 words, of which 8 were recognized with 99.7%+ confidence or something, from which you obtain 5 new diphones for your voice model
- (13:58:14) kiradess: is how I imagine it playing out
- (14:03:53) kiradess: >However, the confidence intervals may still be useful for other applications such as TTS systems for which the segments with reliable boundary positions can be selected automatically.
- (15:35:37) apt-get [~apt-get@ore.katana.bake.kizu.genji.monogatari] entered the room.
- (16:05:38) apt-get_ [~apt-get@E941CCAA.AE17DAB.A8BCC5AB.IP] entered the room.
- (16:08:51) apt-get left the room (quit: Ping timeout: 240 seconds).
- (16:12:11) apt-get__ [~apt-get@7D5356E7.DE55F473.BEE40085.IP] entered the room.
- (16:15:20) apt-get_ left the room (quit: Ping timeout: 240 seconds).
- (16:25:56) apt-get__ is now known as apt-get
- (19:22:48) emeraldgreen1: kiradess You have a good idea about speech synthesis
- (19:24:35) emeraldgreen1: *bought a spare used PC (c2d 2.4 ghz) + lcd monitor for 50$*
- (19:24:40) emeraldgreen1: feels good
- (19:28:09) kiradess: nice
- (19:29:15) kiradess: i have a pc sitting around i'm not using either, its an amd e-series apu
- (19:29:17) kiradess: dual core, 1.8ghz, sips power
- (19:29:28) kiradess: been meaning to make it a server for something but meh
- (19:29:43) emeraldgreen1: my main machine is 6-core amd
- (19:29:44) kiradess: oh, speech shit
- (19:29:58) kiradess: i got a pentium g3258
- (19:30:01) emeraldgreen1: I will probably use this new pc for work
- (19:30:16) kiradess: so speech, yeah I thought it was a good idea
- (19:30:26) kiradess: remains to be seen if it works in practice though
- (19:30:40) kiradess: need to read up on prior work
- (19:30:54) kiradess: automatic speech segmentation
- (19:30:59) emeraldgreen1: there is deep speech synthesis, but it hasn't outperformed traditional approaches yet
- (19:31:41) emeraldgreen1: e.g. http://research.microsoft.com/en-us/projects/dnntts/
- (19:32:20) kiradess: ill give it a look in a min
- (19:33:39) emeraldgreen1: >that really only works if you've got a voice people can stand to listen to, and a lot of time on your hands
- (19:33:39) emeraldgreen1: It should be possible to apply voice-morphing software to TTS output, of course some quality will be lost
- (19:34:40) emeraldgreen1: >not thrilled to be hung up on 3d stuff when i'd really like to dive into speech and NLP stuff
- (19:34:40) emeraldgreen1: Then we need to make a good enough 3d and leave it at it
- (19:36:10) emeraldgreen1: >you might have a spoken sentence of 12 words, of which 8 were recognized with 99.7%+ confidence or something, from which you obtain 5 new diphones for your voice model
- (19:36:42) emeraldgreen1: It looks like a good idea, I thought about pirating lots of audiobooks and aligning them to corresponding texts
- (19:37:13) emeraldgreen1: training model is very heavy computation though
- (20:00:52) kiradess: ?
- (20:03:10) emeraldgreen1: you have to do a 1000 ops per byte of your dataset, probably more
- (20:03:28) kiradess: you mean the RNN way?
- (20:03:33) emeraldgreen1: To train on 100k hr dataset you will need a large cluster and a couple of weeks
- (20:03:52) kiradess: >100k hour dataset
- (20:03:57) kiradess: does such a thing even exist
- (20:04:11) emeraldgreen1: RNNs are very heavy, yes, but traditional HMMs aren't lightweight either
- (20:04:18) emeraldgreen1: kiradess in baidu - yes
- (20:04:29) kiradess: ah baidu
- (20:04:32) emeraldgreen1: also with dataset augmentation (add noise, transform etc)
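For scale, a back-of-envelope on the "1000 ops per byte" figure — every constant here is an assumption (16 kHz, 16-bit mono audio; a cluster sustaining 1e14 ops/s). A single pass at that rate is actually cheap; the "couple of weeks" comes from many epochs, augmented copies of the data, and models that cost far more than 1000 ops per byte:

```python
# All constants are assumptions, not measurements.
hours = 100_000                           # the Baidu-scale dataset mentioned above
bytes_total = hours * 3600 * 16_000 * 2   # 16 kHz sample rate, 2 bytes per sample
ops_per_byte = 1000                       # the claimed lower bound
cluster_rate = 1e14                       # sustained ops/s of the cluster, assumed
seconds_per_pass = bytes_total * ops_per_byte / cluster_rate
print(f"{bytes_total / 1e12:.1f} TB of audio, "
      f"{seconds_per_pass:.0f} s per pass at {ops_per_byte} ops/byte")
```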
- (20:04:50) kiradess: I never really gave it much thought beyond finding an open source toolkit that would do the job
- (20:05:36) kiradess: there was a link I posted before, festival i think it was
- (20:05:53) kiradess: where some of the voices were trained on very little data and sounded decent
- (20:05:58) emeraldgreen1: https://github.com/edobashira/speech-language-processing there is this large list
- (20:07:17) kiradess: http://www.cstr.ed.ac.uk/projects/festival/morevoices.html
- (20:07:17) kiradess: yeah I saw that, good stuff on there
- (20:07:34) kiradess: Scottish male - Alan (ARCTIC), Jon (2hr)
- (20:07:34) kiradess: English RP male - Nick (8hr), Roger (13hr), Korin (TIMIT, ~20mins)
- (20:07:34) kiradess: English RP female - Nina (3hr)
- (20:07:40) kiradess: 2hrs? 3hrs? what?
- (20:08:20) emeraldgreen1: tts doesn't require as much data as the inverse problem
- (20:09:18) kiradess: I guess not
- (20:09:29) kiradess: still, more data than I have at the present
- (20:09:57) emeraldgreen1: we need special data - a female voice in various real life situations
- (20:10:21) emeraldgreen1: also there is this new gnu speech project
- (20:13:32) kiradess: http://pages.cpsc.ucalgary.ca/~hill/helloComparison/helloComparison.wav lol
- (20:14:10) kiradess: its really all about the voice samples at this point
- (20:14:31) kiradess: unless you're going for some strong AI, indistinguishable-from-human approach
- (20:14:46) kiradess: for instance, the latest voiceroid is quite good
- (20:15:04) kiradess: as in, very natural sounding, even compared to the couple before it
- (20:16:18) kiradess: I actually own a physical copy of yukari's
- (20:16:27) kiradess: cost like 150usd to import the shit, after not being able to find a working torrent for probably 3 months after release
- (20:16:42) kiradess: ofc I don't use it anymore since I don't have a windows install
- (20:17:13) kiradess: http://www.ah-soft.com/voiceroid/kotonoha/index.html
- (20:18:27) kiradess: I don't know where I would go about getting decent english voice audio but my idea for jap was drama cds and other seiyuu stuff
- (20:23:13) kiradess: http://www.nicovideo.jp/watch/sm23462858
- (20:40:16) apt-get left the room (quit: Ping timeout: 240 seconds).
- (20:40:21) apt-get_ [~apt-get@CE6E29B4.3D888B11.CAF2BD85.IP] entered the room.
- (22:19:35) kiradess: http://festvox.org/bsv/x794.html
- (22:20:24) emeraldgreen1: kiradess makes me want to enter the dataset myself
- (22:20:37) kiradess: good luck
- (22:21:05) emeraldgreen1: *sleepy* later
- (22:21:09) kiradess: this is why I like tools over consumer products
- (22:21:46) kiradess: i'll put everything in place so you can steal your favorite actor's voice, but i take no responsibility
- (22:21:50) kiradess: later
- (22:22:06) emeraldgreen1: good point
- (22:39:32) kiradess: I'm digging these 90's websites http://www.speech.cs.cmu.edu/comp.speech/
- (22:39:44) kiradess: >Last Revision: 18:40 05-Sep-1997
- (22:40:16) emeraldgreen1: vintage
- (22:40:46) emeraldgreen1: I like vintage dotcom-era forums
- (22:46:45) apt-get__ [~apt-get@DF4F89B7.939532AD.443D2AAB.IP] entered the room.
- (22:48:04) apt-get__ is now known as apt-get
- (22:48:25) apt-get_ left the room (quit: Ping timeout: 240 seconds).
- (22:49:01) apt-get: now that's an example of good design kiradess
- (22:49:03) apt-get: :DDDDD
- (22:56:47) kiradess: damn right
- (28.10.2015 00:22:52) apt-get_ [~apt-get@A2D44C5A.4EA120BF.A8BCC5AB.IP] entered the room.
- (00:26:05) apt-get left the room (quit: Ping timeout: 240 seconds).
- (01:10:00) apt-get_ is now known as apt-get
- (01:58:28) emeraldgreen1: http://arxiv.org/abs/1510.07211 ambitious
- (01:58:46) emeraldgreen1: but looks like this is going to happen
- (13:58:19) apt-get [~apt-get@ore.katana.bake.kizu.genji.monogatari] entered the room.
- (15:30:52) kiradess: >become a reality in future decades
- (15:30:52) kiradess: key word
- (15:32:07) emeraldgreen: it's only a question of datasets and compute
- (15:33:02) emeraldgreen: in the next 10 years we'll see working (if simple) program generation from an informal natural-language description
- (15:34:00) kiradess: maybe
- (15:34:11) kiradess: it has the same problems chat bots do
- (15:34:38) kiradess: we can feed them tons of data, but like speech, most programs are actually completely unique and novel
- (15:35:04) kiradess: might be useful for boilerplate and skeletons of programs
- (15:36:17) kiradess: "like give me a generic java framework for this sort of problem" and you get a bunch of mostly empty classes in a common design pattern
- (15:36:48) emeraldgreen: kiradess you have a valid concern but you underestimate the complexity of the internal representations learned by RNNs
- (15:37:24) emeraldgreen: chatbots are built with boring ngram models, they cannot learn nonlinear representations
- (15:37:35) kiradess: I just don't understand where the novelty is supposed to come from
- (15:37:41) emeraldgreen: they are basically simple tables
- (15:37:57) kiradess: it's like those NN programs they feed with classical music, then it makes new classical music
- (15:37:58) emeraldgreen: kiradess from generalization
- (15:38:32) kiradess: it doesn't start dropping dubstep beats because it learned all about music from that classical and is now inventing a new sound or some shit
- (15:38:51) emeraldgreen: you can test it - save 10% of your training dataset and check if your model generalizes to it
- (15:38:52) kiradess: therein lies the problem
- (15:39:33) kiradess: I don't understand this "generalization" then
- (15:40:03) emeraldgreen: the model is said to generalize if it works on data it has never seen
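The holdout check emeraldgreen mentions is easy to sketch — a minimal Python version, assuming only a dataset that can be shuffled and split (the model itself is out of scope; the point is the 90/10 split):

```python
import random

def train_test_split(data, holdout=0.1, seed=0):
    """Shuffle a copy of the data and hold out the last `holdout`
    fraction for measuring generalization."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(data)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout))
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(range(100))
print(len(train), "training examples,", len(test), "held out")
```

Train only on `train` and evaluate on `test`: if performance collapses on the held-out 10%, the model memorized rather than generalized.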
- (15:40:25) kiradess: that works for stuff like image and sound processing but it really can't be applied beyond a certain complexity
- (15:40:44) kiradess: where an image recognizer will only recognize other images
- (15:40:50) kiradess: and not sounds
- (15:41:14) kiradess: do you see what I mean? programming isn't a 1 dimensional input -> output problem
- (15:41:29) emeraldgreen: kiradess I'm not sure, look at http://karpathy.github.io/2015/05/21/rnn-effectiveness/ there is >Visualizing the predictions and the "neuron" firings in the RNN
- (15:41:29) emeraldgreen: section, it shows the sophistication of internal representation
- (15:41:42) emeraldgreen: >where an image recognizer will only recognize other images
- (15:41:42) emeraldgreen: you have a point
- (15:41:53) kiradess: i hate that link
- (15:42:02) kiradess: all that guy shows are toy applications
- (15:42:03) emeraldgreen: but you can train the system on multimodal dataset
- (15:42:08) emeraldgreen: hehe
- (15:42:14) kiradess: to where the NN does the same shit fed back into it
- (15:42:28) kiradess: like "see a lot of house numbers, write a new house number"
- (15:42:40) kiradess: that's what I mean about complexity and depth
- (15:43:16) emeraldgreen: you are right, but you underestimate the power of learning algorithmic patterns
- (15:43:29) kiradess: instead of a generalization, it often looks like an averaging or a random mixing of the input
- (15:43:32) emeraldgreen: RNNs can learn whole algorithms
- (15:43:56) kiradess: see the source code section, the Shakespeare section
- (15:44:15) kiradess: the RNNs focus on the wrong thing
- (15:44:26) kiradess: and not what we want, which is meaningful output
- (15:44:44) kiradess: and they never will, because all of the information is not being supplied to them
- (15:44:58) kiradess: that's the problem with any NN-based NLP
- (15:45:34) emeraldgreen: >instead of a generalization, it often looks like an averaging or a random mixing of the input
- (15:45:34) emeraldgreen: I wouldn't say they are a linear averaging. Anyway there is a way to train a generative model that doesn't do just linear mixing: http://soumith.ch/eyescream/
- (15:45:34) emeraldgreen: look at http://soumith.ch/eyescream/images/gen_church.png
- (15:45:48) kiradess: humans do not encode all of the meaning into an utterance, only the bare minimum that will enable the receiver to fill in the rest from their experience, knowledge, the context, situation, etc
- (15:46:32) emeraldgreen: I agree about current performance (though standard ngrams wouldn't be able to learn even these regularities you saw in this blog) but > and they never will,
- (15:46:32) emeraldgreen: is too strong a statement
- (15:46:54) kiradess: programs do not have a human's inner life, cannot fill in those blanks, and home in on random noise and meaningless symbols in the transmission
- (15:47:03) emeraldgreen: >all of the information is not being supplied to them
- (15:47:03) emeraldgreen: multi-modal datasets
- (15:47:29) kiradess: "never" is too strong a statement when talking about an unknown future, but not on a flawed approach
- (15:48:36) emeraldgreen: kiradess I don't see a proof that rnns are flawed (that they fundamentally can't learn some things that humans can learn)
- (15:48:40) kiradess: i'm not dismissing RNNs, but they just can't be applied to everything, especially in the situations like I'm talking about, where we are literally saying shit like "darmok and jalad at tanagra"
- (15:49:03) kiradess: because they're not learning like we are
- (15:49:15) emeraldgreen: yup, they aren't universal yet, especially if you consider the computational cost
- (15:49:38) kiradess: no, it's like some kind of chinese room, without the books inside
- (15:50:26) kiradess: i'll wait and see
- (15:50:37) emeraldgreen: I'll think about it. Time to go outside and bike a little
- (15:50:41) kiradess: but nothing really inspires me to experiment with them myself
- (15:52:51) kiradess: I'm not even knocking them or those who research/use them, they just seem only useful for specific applications
- (15:53:23) kiradess: or as an intermediate step in transforming some raw input to a parseable form
- (16:05:27) kiradess: I don't know why I have to argue about them when I actually don't feel strongly about them one way or another
- (17:46:38) kiradess: http://blog.ayoungprogrammer.com/2015/09/a-simple-artificial-intelligence.html
- (18:05:13) kiradess: can't get it working tho
- (20:22:55) kiradess: https://www.youtube.com/watch?v=3u4x1ZIHgrA
- (23:07:46) emeraldgreen: I'm back. Nice video! I like actuated BJDs
- (23:10:30) kiradess: hope i didnt piss you off earlier
- (23:10:40) kiradess: i know you're into RNN stuff
- (23:12:52) kiradess: I honestly don't know enough about it to make a judgement either way
- (23:13:19) emeraldgreen: no, no
- (23:13:43) emeraldgreen: I'm slow at arguing because I'm slow at formulating my responses in english
- (23:13:50) emeraldgreen: reading is easier than writing
- (23:14:03) emeraldgreen: RNN isn't a panacea
- (23:16:11) kiradess: meanwhile I'm surveying speech synthesis software instead of fixing this 3d model
- (23:16:22) emeraldgreen: It's a good use of time as well
- (23:16:44) kiradess: the outlook is grim though, it all sounds so bad
- (23:17:13) emeraldgreen: I don't think it's that bad
- (23:17:25) emeraldgreen: Aigis sounded robotic in persona 3 and it was cool
- (23:17:38) kiradess: lol?
- (23:18:15) kiradess: she was voiced by sakamoto maaya on the jap version
- (23:18:33) emeraldgreen: I know, but she tried to mimic the robot voice
- (23:18:35) kiradess: was her normal voice iirc
- (23:18:40) emeraldgreen: hmm
- (23:18:43) emeraldgreen: maybe
- (23:18:45) kiradess: oh, now that you mention it
- (23:18:59) kiradess: very monotonic, expressionless
- (23:19:04) emeraldgreen: gnu speech reminds me of aigis a bit
- (23:19:30) kiradess: just playing the character role, vs actually putting her voice through a filter or something
- (23:20:15) emeraldgreen: https://youtu.be/gHMyW7Fr0y8?t=15 hehe
- (23:21:00) emeraldgreen: >tfw no tartarus to explore
- (23:21:33) kiradess: barf
- (23:21:54) kiradess: did the international versions have jap voice options? i played the original
- (23:22:22) emeraldgreen: idk, I played in english
- (23:22:32) emeraldgreen: more free language practice
- (23:23:46) kiradess: do you have a demo for gnuspeech? I can't find one atm
- (23:24:19) emeraldgreen: http://pages.cpsc.ucalgary.ca/~hill/helloComparison/helloComparison.wav
- (23:26:43) kiradess: oh that
- (23:28:20) emeraldgreen: Also there is this http://tts.speech.cs.cmu.edu:8083/
- (23:28:23) emeraldgreen: not so bad
- (23:29:56) emeraldgreen: about the RNN universality I still don't completely understand your point
- (23:30:26) emeraldgreen: >programs do not have a human's inner life, cannot fill in those blanks, and home in on random noise and meaningless symbols in the transmission
- (23:30:26) emeraldgreen: >because they're not learning like we are
- (23:32:28) kiradess: by "inner life" I may be misusing the expression but essentially it means the whole of your existence, your memories, your knowledge base, your goals long and short term, etc
- (23:32:59) kiradess: all human communication is just triggering a response based on those factors
- (23:33:44) kiradess: we can communicate at all because so much of our inner lives are common to us all, at least within ethnic groups, regions, language users, etc
- (23:34:33) kiradess: so a RNN or anything else that was only exposed to our communications, be it words, sounds, even images, would never draw the "correct" conclusions from it
- (23:34:48) kiradess: because it isn't there, anywhere, at all
- (23:34:54) emeraldgreen: We share a common internal model because we have grown/learned in mostly the same environment
- (23:35:08) kiradess: basically
- (23:35:16) emeraldgreen: Ah, I understand, your argument is about symbol grounding
- (23:35:25) kiradess: not entirely
- (23:35:46) kiradess: that's one aspect, maybe, assuming we use platonic symbols for things
- (23:36:26) emeraldgreen: Well, your point is that we cannot model human behavior by training our model on limited datasets that lack information available to humans
- (23:36:32) emeraldgreen: ?
- (23:36:38) kiradess: pretty much
- (23:36:56) kiradess: speech recognition is the same way
- (23:37:13) kiradess: we achieve very good results with some algorithms, say 97%+
- (23:37:33) kiradess: but in reality, humans only "hear" something like 80% of what is said to them
- (23:37:39) emeraldgreen: But if we have a very powerful model (say, a global search of all possible algorithms) and the right dataset, could we learn an agent that behaves similarly to a human?
- (23:38:05) kiradess: the rest is inferred, the gaps filled in, just plain made up, etc
- (23:38:23) kiradess: I wouldn't say so
- (23:38:35) emeraldgreen: I think it's because humans have much more context (priors in ML parlance)
- (23:38:53) kiradess: I'm in the camp that believes an AI would have to, at the very least, live a more or less human life, in order to think and act like one
- (23:39:39) emeraldgreen: kiradess so, hypothetically, in the space of all possible programs there aren't any that can model the behavior of some human (with finite precision of course) ?
- (23:40:22) kiradess: that's not a very useful hypothetical
- (23:40:42) kiradess: because yes, if you're talking about the infinite, then surely it does
- (23:40:56) emeraldgreen: I agree that we need to give our model human-like embodied experience to have chance at training human-like AI
- (23:41:41) kiradess: that's like when people say "the universe is (practically) infinite, so therefore, there is another planet out there exactly like this one, with someone exactly like you on it" it's retarded
- (23:41:48) emeraldgreen: I agree
- (23:41:55) emeraldgreen: but there is no need in infinity here
- (23:42:08) emeraldgreen: Human brain is a small finite physical object
- (23:42:21) kiradess: there is when you're trying to code in a human life from scratch
- (23:42:48) emeraldgreen: Isn't it true that the program that simulates the brain's behavior can/should be finite as well?
- (23:42:54) kiradess: it's gonna take an infinite amount of random arranging to get it right
- (23:43:06) kiradess: obviously
- (23:43:32) kiradess: so it's a matter of picking how it should be structured, how it should operate, etc
- (23:43:54) kiradess: but it's a combinatorics problem
- (23:44:21) emeraldgreen: kiradess My argument here is that it's at least theoretically possible to take a recording of human experience and fit a finite algorithmic model to it. The obvious approach is a genetic algorithm over any programming language. The model selection criterion is minimal description length.
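The minimum-description-length criterion in that argument can be illustrated with a toy score — a hedged sketch, not the actual proposal: a candidate's cost is the bits needed to write the model down plus the bits needed to patch its errors on the data, so a short program that generalizes beats a table that memorizes:

```python
import math

def mdl_score(model_bits: float, errors: int, n: int) -> float:
    """Toy MDL: description length of the model plus a crude
    log2(n)-bit charge to locate and correct each residual error.
    (Real MDL uses a proper two-part or prequential code.)"""
    return model_bits + errors * math.log2(n)

n = 1024  # dataset size, arbitrary
candidates = {
    "lookup_table":  mdl_score(model_bits=8 * n, errors=0, n=n),  # memorizes everything
    "small_program": mdl_score(model_bits=200,   errors=3, n=n),  # short, a few mistakes
}
best = min(candidates, key=candidates.get)
print(best, candidates[best])
```

A genetic search over programs would use a score like this as its fitness function, favoring the shortest model that still explains the experience log.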
- (23:44:48) kiradess: sounds like a tautology
- (23:44:52) emeraldgreen: Also it is doable in finite (if very large) time
- (23:45:19) kiradess: is it?
- (23:45:45) kiradess: for instance, if you wired a baby from birth to intercept all incoming nerve signals
- (23:45:46) emeraldgreen: ah, maybe it's not the full argument, I mean "... and the resulting model will be functionally similar to the human whose experience log we used to train it)"
- (23:45:52) emeraldgreen: yup
- (23:46:35) kiradess: theoretically, it should be
- (23:46:53) kiradess: the time and space complexity problem remains
- (23:47:07) emeraldgreen: Then we don't have any disagreements
- (23:47:18) kiradess: you claim a model could be fitted in a reasonably finite amount of time, but that remains to be seen
- (23:47:38) emeraldgreen: A sufficiently large RNN can model any given program, the proof is very simple.
- (23:48:33) emeraldgreen: kiradess Nope, I think it's an unreasonably large amount of time, if we use plain genetic programming (which is general, but slow).
- (23:48:36) kiradess: i assume you mean turing completeness?
- (23:49:32) emeraldgreen: kiradess Almost - RNNs don't have infinite memory (just like our desktop computers and brains though)
- (23:49:48) emeraldgreen: but in a more down-to-earth sense, yes
- (23:50:37) kiradess: that also assumes that RNNs could model what the brain does, in finite time and space
- (23:51:22) emeraldgreen: kiradess yup, it's based on the assumption that the human brain can be modeled by a classic algorithm in finite time
- (23:51:52) emeraldgreen: Don't think I'm another AI crank, lel
- (23:52:10) emeraldgreen: I don't say it's possible here and now, just in theory
- (23:52:14) kiradess: but because we don't know how the brain works, the best RNN may be no more effective than randomly guessing at configurations and testing them
- (23:52:27) emeraldgreen: kiradess Yup, there is a prior problem
- (23:53:21) kiradess: forgive me if the topic doesn't really excite me
- (23:53:49) emeraldgreen: Human brain has lots(?) of built-in evolutionary biases that are absent (or replaced by our own simple architectural biases) in RNNs and other models.
- (23:53:58) emeraldgreen: Ok, I won't bother you then
- (23:54:46) emeraldgreen: I just feel that engineering behaviors by hand is very tedious
- (23:56:03) emeraldgreen: maybe there will be a compromise though
- (23:57:02) kiradess: it is tedious
- (23:57:21) kiradess: hell, maybe humans don't even learn like we think we do
- (23:57:30) kiradess: and we've all been trained by hand
- (23:57:37) emeraldgreen: kiradess maybe lel
- (23:57:55) emeraldgreen: neuroscience knows only so much
- (23:57:57) kiradess: you've heard of feral children, or children locked in rooms for years with no human contact?
- (23:58:12) kiradess: and how they grow up mentally retarded as a result
- (23:58:24) emeraldgreen: kiradess yup, I heard that their brains were underdeveloped
- (23:58:57) kiradess: maybe we arent as capable of unsupervised learning as we thought, and that part of raising a child is to configure their NNs
- (23:59:13) kiradess: in such a way that they can learn and function independently thereafter
- (23:59:52) emeraldgreen: kiradess there are known critical periods for learning various skills
- (29.10.2015 00:00:03) kiradess: yep
- (00:00:23) kiradess: that might be a neural plasticity issue too, idk
- (00:00:29) emeraldgreen: but there are also experiments that show the universality of neocortical learning
- (00:01:01) emeraldgreen: well known experiment with ferrets http://home.fau.edu/lewkowic/web/SUR.PDF
- (00:03:05) kiradess: Hawkins says the same, that the neocortex is largely homogeneous
- (00:03:14) emeraldgreen: yup, I like his theory
- (00:03:31) emeraldgreen: general AI is a hard problem and it's not our problem anyway
- (00:03:45) kiradess: yeah
- (00:04:10) kiradess: I haven't given much thought to a right-now solution though tbh
- (00:04:26) emeraldgreen: solution to which exact problem?
- (00:04:57) kiradess: a rudimentary waifu AI
- (00:05:06) kiradess: in terms of NLP stuff
- (00:05:21) kiradess: this was interesting, if what he claims to be able to do is true http://blog.ayoungprogrammer.com/2015/09/a-simple-artificial-intelligence.html
- (00:05:38) emeraldgreen: yes it is
- (00:05:40) kiradess: I can't get the dependencies working right though, so I haven't tried it
- (00:05:50) emeraldgreen: I haven't as well, never had the time
- (00:06:40) kiradess: well it's interesting because it uses mostly off-the-shelf software components
- (00:07:11) emeraldgreen: http://smerity.com/articles/2015/keras_qa.html this one is (or isn't) a similar QA system
- (00:07:18) kiradess: though this guy http://spacy.io/blog/dead-code-should-be-buried/ claims the stanford parser used in it is a piece of shit
- (00:08:57) apt-get_ [~apt-get@Rizon-4931578C.adsl196-12.iam.net.ma] entered the room.
- (00:09:18) kiradess: what i'd like to know is, can RNN's be used in any realtime capacity?
- (00:09:39) kiradess: can they both simultaneously learn and produce output?
- (00:10:01) kiradess: how fast can a new input affect it, and thus, the output?
- (00:10:13) kiradess: because with humans, you're talking milliseconds
- (00:10:51) kiradess: everything I read about RNNs tends to talk in minutes to days of all-out 8-core computing
- (00:11:49) emeraldgreen: kiradess Nope, training is >1000x harder than running the model. But you can train your model to use the short term memory.
- (00:12:14) kiradess: i don't follow
- (00:12:16) apt-get left the room (quit: Ping timeout: 240 seconds).
- (00:12:26) emeraldgreen: hah! 8-core computing is low-tier, you need a Nvidia Titan X here.
- (00:12:34) emeraldgreen: (or some other decent gpu)
- (00:12:38) kiradess: I was under the impression that they were all "train then use"
- (00:13:49) kiradess: that after the training phase, new input is simply processed and spit out as output, without affecting the model in any way
- (00:13:50) emeraldgreen: kiradess I mean training an RNN model is offline and takes days. The model itself is fast though, 100ms per timestep on the CPU
- (00:14:14) emeraldgreen: kiradess RNNs have a state vector, it is a short term memory
- (00:14:59) emeraldgreen: in the previous link RNNs are trained to answer questions using evidence from input that came in tens of timesteps ago
- (00:15:27) emeraldgreen: real long term memory is still an ongoing research problem though
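The "state vector" mechanic emeraldgreen describes can be sketched in a few lines. This is a minimal, untrained Elman-style cell with made-up sizes and random weights, purely to illustrate how the hidden state carries information from earlier timesteps forward; it is not a working language model.

```python
import numpy as np

# Minimal Elman-style RNN cell. The hidden state h is the "state vector"
# acting as short-term memory. Sizes and weights are arbitrary and
# untrained -- this only shows the mechanics of recurrence.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))  # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden -> hidden

def step(h, x):
    # The new state depends on both the current input and the previous
    # state, so evidence from many timesteps ago can still influence
    # the output at the current step.
    return np.tanh(W_xh @ x + W_hh @ h)

h = np.zeros(n_hid)
for t in range(5):
    x = rng.normal(size=n_in)
    h = step(h, x)

print(h.shape)  # (8,)
```

Running the model is just this cheap per-timestep update; the expensive part (the days of GPU time discussed above) is finding good values for `W_xh` and `W_hh` by backpropagation through time.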
- (00:16:10) emeraldgreen: btw, how do you even train your reasoning model online, how do you know that its action was right?
- (00:16:39) emeraldgreen: (well, if not reasoning, then conversation model)
- (00:16:59) kiradess: idk, but it's gotta be done
- (00:17:07) emeraldgreen: I know that predictive models (they just predict next input based on current input) are trained online
- (00:17:27) kiradess: you can't train a model offline then hope it keeps up in a continued online state
- (00:17:35) kiradess: it would never learn anything new
- (00:17:42) emeraldgreen: kiradess yup
- (00:18:04) emeraldgreen: episodic memory is an unsolved problem
- (00:18:13) kiradess: i'd suggest a kind of very reserved personality model
- (00:18:25) kiradess: that just wants to listen and absorb information
- (00:18:30) emeraldgreen: well to be fair the DQN model really has long-term memory, but it's a general AI and it's not for us
- (00:18:38) emeraldgreen: kiradess yup
- (00:18:44) emeraldgreen: maybe even a hardcoded one
- (00:18:50) emeraldgreen: just a user profile
- (00:18:55) kiradess: and ask questions about everything
- (00:20:22) kiradess: i'd rather have an AI that says "I'm not sure what you mean by X" after every sentence I say, than one that returns markov-chain-esque gibberish
- (00:20:45) emeraldgreen: hehe, I agree, I think that's what the user wants as well
- (00:21:08) kiradess: as long as it could use whatever grammar and language rules are inherent to the language, plus some kind of data modeling/question answering like I linked
- (00:21:20) kiradess: I think it would sound ok
- (00:21:23) emeraldgreen: but for my own use I'd like an AI that would sometimes amuse me with unexpected generalizations
- (00:21:41) emeraldgreen: >as long as it could use whatever grammar and language rules are inherent to the language, plus some kind of data modeling/question answering like I linked
- (00:21:41) emeraldgreen: hello Cyc!
- (00:22:25) kiradess: one of those
- (00:22:34) kiradess: don't those guys always kill themselves?
- (00:22:42) emeraldgreen: what I'm afraid of is sinking a year or two into handcrafting a brittle AI that will never work
- (00:22:50) emeraldgreen: kiradess I mean Cyc Corp
- (00:23:08) kiradess: yeah this https://en.wikipedia.org/wiki/Cyc
- (00:23:26) kiradess: I was referring to a couple of other AI researchers who tried similar projects
- (00:23:34) kiradess: forget their names now
- (00:23:49) emeraldgreen: ah
- (00:24:00) emeraldgreen: you see, symbolic AI died for a reason
- (00:24:47) emeraldgreen: we have to constrain our development so we won't repeat their errors (trying to do too much with a handcoded symbolic ai)
- (00:25:02) kiradess: https://en.wikipedia.org/wiki/Chris_McKinstry this guy and Push Singh
- (00:26:21) kiradess: they both had projects that tried to crowdsource knowledge by having people submit facts to them
- (00:26:45) emeraldgreen: yup
- (00:27:19) emeraldgreen: the only hope for symbolic AI to appear to be working is to radically constrain its interaction domain and use artistic skills to mask its weaknesses
- (00:27:26) emeraldgreen: just like it's done in computer games
- (00:29:44) kiradess: hm, i'm too tired to come up with any brilliant schemes right now but
- (00:29:44) kiradess: i'd like to try a kind of hybrid approach
- (00:29:44) kiradess: take audio in, pass it through a recognizer to get the transcription
- (00:29:44) kiradess: pass the transcription to a part of speech tagger
- (00:29:53) emeraldgreen: ok
- (00:29:56) emeraldgreen: I'm afk
- (00:30:03) kiradess: ok
- (00:31:16) kiradess: use the now tagged words to build a sentence map, like in http://blog.ayoungprogrammer.com/2015/09/a-simple-artificial-intelligence.html
- (00:31:42) kiradess: from there, extract all the information possible
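The tagger-to-sentence-map step above can be sketched as a toy: given an already-tagged sentence (word, part-of-speech pairs, which in practice would come from a real POS tagger earlier in the pipeline), pull out a crude (subject, verb, object) triple the way the linked post builds its sentence map. Tags follow the Penn Treebank convention; the extraction rule is a deliberate oversimplification, not the post's actual algorithm.

```python
# Toy information extraction over POS-tagged input: take the first two
# nouns and the first verb as a crude (subject, verb, object) triple.
# A real sentence map would use a proper parse, not this heuristic.
def extract_triple(tagged):
    nouns = [w for w, t in tagged if t.startswith("NN")]
    verbs = [w for w, t in tagged if t.startswith("VB")]
    if len(nouns) >= 2 and verbs:
        return (nouns[0], verbs[0], nouns[1])
    return None

# Simulated tagger output for "the dog chased the cat".
tagged = [("the", "DT"), ("dog", "NN"), ("chased", "VBD"),
          ("the", "DT"), ("cat", "NN")]
print(extract_triple(tagged))  # ('dog', 'chased', 'cat')
```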
- (00:32:21) kiradess: try to distinguish between absolute and temporal facts
- (00:32:43) kiradess: maybe, idk about that one
- (00:32:57) kiradess: but I think it's important
- (00:33:20) kiradess: for instance, if I said "oh btw, the earth is round", that will still be true tomorrow
- (00:33:46) kiradess: but if I said "oh btw, dinner's ready", that most likely won't be true at some random point in the future
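The absolute-vs-temporal split could start as something as crude as a cue-word check. A real system would need tense analysis and world knowledge; this keyword heuristic (with an invented cue list) is only a hypothetical illustration of the distinction in the two examples above.

```python
# Hypothetical sketch: classify a stated fact as "temporal" (likely to
# expire) or "absolute" (stable over time) using cue words. The cue
# list is an arbitrary placeholder, not a real lexicon.
TEMPORAL_CUES = {"ready", "now", "today", "tonight", "currently", "soon"}

def fact_kind(sentence: str) -> str:
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    return "temporal" if words & TEMPORAL_CUES else "absolute"

print(fact_kind("oh btw, the earth is round"))  # absolute
print(fact_kind("oh btw, dinner's ready"))      # temporal
```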
- (00:35:46) kiradess: then categorize the received input ala this paper https://web.stanford.edu/~jurafsky/ws97/CL-dialog.pdf
- (00:36:01) kiradess: to understand if a fact is being stated, a question is being asked, etc
- (00:36:28) kiradess: this is where most chat bots fall flat, they always respond, regardless of whether a response is warranted or not
- (00:36:57) kiradess: then take all the information on hand to formulate a response
- (00:37:28) kiradess: knowledge on hand, current context, what was just said, etc
- (00:38:02) kiradess: model short term memory like a cache of past events, weighted by their age
- (00:39:43) kiradess: so you can avoid saying the same sentence twice, or respond in some way to receiving the same sentence as input in quick succession
- (00:39:43) kiradess: like, "I just answered that question. Aren't you listening to me?"
- (00:40:23) kiradess: it would take a whole bunch of tuning and experimenting to see which approach works best for which aspect
- (00:40:27) apt-get_ left the room (quit: Read error: Connection reset by peer).
- (00:40:37) kiradess: different statistical approaches, etc
- (00:40:59) apt-get_ [~apt-get@7E2E9DB4.557EA84B.257419B5.IP] entered the room.
- (00:41:13) kiradess: you could statistically model anything really
- (00:41:50) kiradess: take this paper I just linked, you could build some kind of model out of the order in which dialogue acts occur
- (00:43:19) kiradess: and infer that questions should be followed by answers, disagreement followed by a clarifying question, etc
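That dialogue-act ordering idea reduces to counting transitions between acts, like a bigram model. This sketch uses invented stand-in labels, not the actual tag set from the Jurafsky et al. paper linked above.

```python
from collections import Counter, defaultdict

# Bigram model over dialogue acts: count which act follows which in an
# observed sequence, then predict the most likely successor. The
# sequence and labels here are made up for illustration.
observed = ["statement", "question", "answer", "statement",
            "question", "answer", "disagreement", "question", "answer"]

transitions = defaultdict(Counter)
for prev, nxt in zip(observed, observed[1:]):
    transitions[prev][nxt] += 1

def likely_next(act):
    return transitions[act].most_common(1)[0][0]

print(likely_next("question"))      # answer
print(likely_next("disagreement"))  # question
```

Trained per user, the transition counts would drift toward that couple's conversational style, which is the adaptation an off-the-shelf pretrained RNN wouldn't give you.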
- (00:44:38) kiradess: you could say a RNN could model that, but it wouldnt adapt to the individual or give rise to unique relations between user and waifu
- (00:45:42) kiradess: for instance, a couple who are argumentative, a couple that doesn't rely on words much, where the user does most of the talking, or vice-versa, etc
- (00:46:15) kiradess: hit some sort of balance between imitating and complementing the user
- (00:47:26) kiradess: or, you could use RNNs, and use a mandatory "sleep" time for training
- (00:47:27) kiradess: which some say humans do a form of, moving short-term memories to long-term and who knows what else when we sleep
- (00:47:44) kiradess: but whatever, im rambling, I gotta sleep
- (00:50:22) kiradess: speaking of context, when I think of context, I imagine a cache of all sensory input and output (what is heard, what is said, what is seen, what is at hand, what is being done, etc) for N number of snapshots extending into the past
- (00:50:22) kiradess: however many can be reasonably searched and worked upon
- (00:50:59) kiradess: with every element weighted based on its timestamp's age and/or difference from its current counterpart
- (00:51:26) kiradess: let's say you've got 10 snapshots in "memory", and we're looking at only the "location" field
- (00:52:10) kiradess: the last 5 are "house", the 5 before that are "the local park"
- (00:52:37) kiradess: if the current location is "house", that should affect the weight of the entire snapshot
- (00:52:57) kiradess: or at least the fields that have a location component of any kind
- (00:53:18) kiradess: are dependent on location in some way
- (00:53:43) kiradess: but yeah, idk
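The snapshot-weighting scheme in the house/park example could be sketched like this. The field names, the recency formula, and the match bonus are all arbitrary placeholders; as emeraldgreen notes later, these are exactly the parameters that would need optimization rather than hand-tuning.

```python
# Sketch of the context cache: N sensory snapshots, each weighted by
# recency and by agreement with the current snapshot. The formula and
# constants are placeholders for values that would have to be tuned.
def snapshot_weight(snapshot, current, now, age_scale=10.0):
    age = now - snapshot["t"]
    recency = 1.0 / (1.0 + age / age_scale)          # older -> lighter
    match = 1.0 if snapshot["location"] == current["location"] else 0.5
    return recency * match

# The 10-snapshot example from above: 5 at "the local park", then 5 at
# "house", with the current location being "house".
current = {"t": 10, "location": "house"}
snapshots = (
    [{"t": t, "location": "the local park"} for t in range(0, 5)] +
    [{"t": t, "location": "house"} for t in range(5, 10)]
)
weights = [snapshot_weight(s, current, now=10) for s in snapshots]
print(weights[9] > weights[0])  # True: recent house beats old park
```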
- (03:24:46) apt-get_ left the room (quit: Remote host closed the connection).
- (03:54:49) The account has disconnected and you are no longer in this chat. You will automatically rejoin the chat when the account reconnects.
- (03:57:51) emeraldgreen: we should write it down, these are good ideas
- (03:58:06) emeraldgreen: >from there, extract all the information possible
- (03:58:06) emeraldgreen: AI-complete problem
- (03:58:18) emeraldgreen: >try to distinguish between absolute and temporal facts
- (03:58:18) emeraldgreen: doable in a limited domain
- (03:58:39) emeraldgreen: >to understand if a fact is being stated, a question is being asked, etc
- (03:58:39) emeraldgreen: need a list of categories of phrases
- (03:58:56) emeraldgreen: >then take all the information on hand to formulate a response
- (03:58:56) emeraldgreen: AI-complete problem
- (04:00:24) emeraldgreen: >knowledge on hand, current context, what was just said, etc
- (04:00:24) emeraldgreen: This is a good idea, we started to design the behavior system around the context too. But it requires formalization of all features that can be present in the context.
- (04:01:13) emeraldgreen: >it would take a whole bunch of tuning and experimenting to see which approach works best for which aspect
- (04:01:13) emeraldgreen: It's impossible to do by hand, we should optimize params with some optimization algorithm. Optimizing by hand is a road to nowhere, really.
- (04:03:58) emeraldgreen: >you could statistically model anything really
- (04:03:58) emeraldgreen: I agree but in reality pure statistical models are either too shallow (naive Bayes), intractable (full Bayesian inference), or they just require a large amount of domain expertise (to design a small graph of conditional dependencies between latent variables) and huge computational resources (because of sampling).
- (04:03:58) emeraldgreen: Probabilistic programming is an interesting approach though.
- (04:05:00) emeraldgreen: >you could say a RNN could model that, but it wouldnt adapt to the individual or give rise to unique relations between user and waifu
- (04:05:00) emeraldgreen: In practice ANNs are able to model much more complex distributions than pure probabilistic approaches (that's why we don't see prob. based approaches winning competitions)
- (04:05:52) emeraldgreen: >or, you could use RNNs, and use a mandatory "sleep" time for training
- (04:05:52) emeraldgreen: Probabilistic models with latent variables require lots and lots of sampling (training) to tune their parameters
- (04:07:06) emeraldgreen: >speaking of context
- (04:07:06) emeraldgreen: I see two models for context: 1) handcrafted list of features 2) distributed representation like word2vec
- (04:07:06) emeraldgreen: >but whatever, im rambling, I gotta sleep
- (04:07:06) emeraldgreen: See ya later!
- (04:07:59) emeraldgreen: >with every element weighted based on its timestamp's age and/or difference from its current counterpart
- (04:07:59) emeraldgreen: And here you have introduced 2*N weights (age, diff) into your model, which will require extensive training to find their optimal values
- (04:08:22) emeraldgreen: See ya tomorrow in this chat!