Experience-Based Language Acquisition: A Computational Model of Human Language Acquisition
Universal-Publishers, 2002. 144 pages.

Almost from the very beginning of the digital age, people have sought better ways to communicate with computers. This research investigates how computers might be enabled to understand natural language in a more humanlike way. Based in part on cognitive development in infants, it introduces an open computational framework for visual perception and grounded language acquisition called Experience-Based Language Acquisition (EBLA). EBLA can watch a series of short videos and acquire a simple language of nouns and verbs corresponding to the objects and object-object relations in those videos. Once it has acquired this protolanguage, EBLA can perform basic scene analysis to generate descriptions of novel videos.

The EBLA architecture comprises three stages: vision processing, entity extraction, and lexical resolution. In the vision processing stage, EBLA processes the individual frames of each short video, using a variation of the mean shift image segmentation algorithm to identify and store information about significant objects. In the entity extraction stage, EBLA abstracts information about the significant objects in each video, and the relationships among those objects, into internal representations called entities. Finally, in the lexical resolution stage, EBLA extracts the individual lexemes (words) from a simple description of each video and attempts to generate entity-lexeme mappings using an inference technique called cross-situational learning. EBLA is not primed with a base lexicon, so it must bootstrap its lexicon from scratch.
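The core idea of cross-situational learning, as described above, is that a lexeme's meaning can be narrowed to whatever is common across all of the experiences in which that lexeme occurs. The following is a minimal sketch of that intersection step only; the function name and toy entity/lexeme data are illustrative assumptions, not EBLA's actual Java implementation.

```python
def cross_situational_learning(experiences):
    """Map each lexeme to the intersection of the entity sets from all
    experiences (videos) whose description contains that lexeme."""
    candidates = {}
    for entities, lexemes in experiences:
        for lexeme in lexemes:
            if lexeme not in candidates:
                candidates[lexeme] = set(entities)
            else:
                # Each new co-occurrence can only narrow the candidates.
                candidates[lexeme] &= set(entities)
    return candidates

# Three toy "experiences": (entities detected, words in the description).
experiences = [
    ({"hand", "ball", "pickup"}, ["hand", "pickup", "ball"]),
    ({"hand", "ring", "drop"},   ["hand", "drop", "ring"]),
    ({"hand", "ball", "touch"},  ["hand", "touch", "ball"]),
]
mappings = cross_situational_learning(experiences)
```

Note that pure intersection leaves residual ambiguity ("ball" still has "hand" as a candidate here, because the hand appears in every experience); a fuller resolver would also remove an entity from other lexemes' candidate sets once a lexeme resolves to that single entity.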
Contents
8
COMPUTATIONAL MODELS OF PERCEPTUAL  29
Model  42
Segmentation  61
CONCLUSIONS  99
REFERENCES  105
APPENDIX A: LISTING OF EXPERIENCES PROCESSED  111
APPENDIX B: LISTING OF RESOURCES FOR THE EBLA  118
SQL USED TO CONSTRUCT THE EBLA  124
Common terms and phrases
AIBO, algorithm, animations, attribute values, audio, Bailey, ball, Boolean flag, bowl, centroid, cognitive, color, computational model, computer vision system, conceptual symbols, CREATE INDEX, cross-situational, cube, EBLA Model, ebla_data database, edge detection, eed2.experience_id, entities and lexemes, entity extraction, entity-lexeme mappings, epigenesis, event representations, experience_entity_data, experience_id, experience_lexeme_data, frame, FrameProcessor, hand drop ring, hand pickup ball, hand roll ring, hand touch ball, image segmentation, infants, INT2, INTEGER DEFAULT, internal representations, Java, Java Media Framework, JavaDoc, language acquisition, Lexeme Mapping, LexemeResolver, lexical acquisition, lexical resolution, lexicon, minimum standard deviation, Norris and Hoffman, NULL, object entities, object-object relations, Params class, perceptual, pixels, polygon, PostgreSQL, PRIMARY KEY, proprioception, protolanguage, protolanguage descriptions, relation entities, ring drop, segmentation parameters, significant objects, Siskind, speech recognition, speech synthesis, success rates, Sun Microsystems, undersegmentation, utterance, VARCHAR, vase, verbs, VirtualDub, x-schema
Popular passages
Page 126 - Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Page 126 - CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Page 28 - The highest level at which category members have similarly perceived overall shapes. The highest level at which a single mental image can reflect the entire category. The highest level at which a person uses similar motor actions for interacting with category members. The level at which subjects are fastest at identifying category members. The level with the most commonly used labels for category members. The first level named and understood by children. The first level to enter the lexicon of a...
Page 108 - "Mean Shift: A Robust Approach Toward Feature Space Analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 24, pp.
Page 126 - University nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE...
Page 28 - The level with the most commonly used labels for category members. - The first level named and understood by children. - The first level to enter the lexicon of a language. - The level with the shortest primary lexemes. - The level at which terms are used in neutral contexts. For example, There's a dog on the porch...
Page 29 - ... understand words, and around that birthday, they start to produce them. Words are usually produced in isolation; this one-word stage can last from two months to a year. For over a century, and all over the globe, scientists have kept diaries of their infants' first words, and the lists are almost identical. About half the words are for objects: food (juice, cookie), body parts (eye, nose), clothing (diaper, sock), vehicles (car, boat), toys (doll, block), household items (bottle, light), animals...
Page 25 - ... produce the variety of vowel sounds used by adults. Not much of linguistic interest happens during the first two months, when babies produce the cries, grunts, sighs, clicks, stops, and pops associated with breathing, feeding, and fussing, or even during the next three, when coos and laughs are added. Between five and seven months babies begin to play with sounds, rather than using them to express their physical and emotional states, and their sequences of clicks, hums, glides, trills, hisses,...
Page 27 - But there is no overall shape that you can assign to a generalized piece of furniture or a vehicle so that you could recognize the category from the shape. (2) The highest level at which a single mental image can represent the entire...
Page 34 - Rule 6: If all word symbols in the utterance have converged on their actual conceptual-symbol sets, for each word symbol w in the utterance, remove from D(w) any conceptual expressions t, for which there do not exist possible conceptual expressions for the other word symbols in the utterance that can be given, as input, to COMPOSE (the composition routine), along with t, to yield, as its output, one of the remaining utterance meanings. (Siskind 1997, 61) Note that some words such as "the" in an original...
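Siskind's Rule 6, quoted above, can be sketched as a simple constraint check. The sketch below is a toy illustration under stated assumptions: COMPOSE is replaced by plain set union (Siskind's actual composition routine is more elaborate), and the candidate sets, word symbols, and utterance meaning are hypothetical examples, not data from the book.

```python
from itertools import product

def compose(parts):
    # Toy stand-in for Siskind's COMPOSE routine: an utterance meaning
    # is just the union of the symbols contributed by its words.
    out = set()
    for p in parts:
        out |= p
    return frozenset(out)

def apply_rule_6(D, utterance, remaining_meanings):
    # For each word symbol w, keep a candidate expression t only if some
    # choice of candidates for the other words composes with t to yield
    # one of the remaining utterance meanings.
    for w in utterance:
        others = [x for x in utterance if x != w]
        keep = set()
        for t in D[w]:
            for combo in product(*(D[x] for x in others)):
                if compose((t,) + combo) in remaining_meanings:
                    keep.add(t)
                    break
        D[w] = keep
    return D

# Hypothetical example: utterance "hand pickup ball", one remaining
# meaning; "hand" still has a spurious candidate RING to be pruned.
D = {
    "hand":   {frozenset({"HAND"}), frozenset({"RING"})},
    "pickup": {frozenset({"PICKUP"})},
    "ball":   {frozenset({"BALL"})},
}
meanings = {frozenset({"HAND", "PICKUP", "BALL"})}
apply_rule_6(D, ["hand", "pickup", "ball"], meanings)
```

After the call, the RING candidate is gone from D["hand"], since no choice of candidates for "pickup" and "ball" can compose with RING to produce the remaining meaning.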