About Me

I love both languages and computers. So naturally, my main interests are Machine Learning and Natural Language Processing. I have an obsession with data and I'm curious about how things work, so I am very interested in all kinds of machine learning algorithms, data mining and information extraction. It is fascinating what we can do with this today :)

I do science.


This requires both creativity and empirical inquiry. I distill meaning from the data that is available, and I know how to tell a statistically grounded story through data, one that eschews complexity and spurious correlations.


My work usually involves finding relationships in data using statistics, visualization tools and machine learning techniques. I create recommendation algorithms, often relying on intuition mixed with clever feature selection, combined with a deep understanding of the realities of large-scale, highly available systems.


I apply supervised and unsupervised techniques in the context of large data sets and latent variables.

I hold an MSc from the Computer Science department at The University of Haifa, in the field of Natural Language Processing.

You can reach me at:
Email: danny@shach.am 
Cell: 972-52-4457130



Personal Links


What we live for...

In my spare time I love biking, mainly XC, on my GT bikes (I have started recording my trips here).

I also like to hike, and I used to guide at Sayarut Groups. One of my greatest loves is our Negev: yellow, open and quiet. I also try to travel outside of Israel as much as I can. Some photos from my many tracks can be seen here.

I usually go trekking with Radius trips, mainly to Jordan. Some pictures can be found here and here.

I'm not much of an athlete, but I like watching sports, especially football: Maccabi Haifa and Real Madrid.

But what takes up most of my time is my magnificent family.



We live far far away. In Kfar Kish.

Research

My primary research interests lie at the boundary between human languages and computer science, sometimes referred to as natural language processing.

My thesis is in the field of Natural Language Processing. It was done under the supervision of Dr. Shuly Wintner.

Because of the complex morphology of Semitic languages, even shallow applications, such as search and information retrieval engines, require morphological analysis and disambiguation as a first step. The unique word formation machinery, along with the standard Hebrew orthography, which leaves most of the vowels unspecified, makes morphological disambiguation of Hebrew a much more complex endeavor than the parallel POS tagging task for English.

Our approach in this research was to try to decouple the problem into simpler classification tasks and then combine the results in a sophisticated manner, taking into account the constraints that hold among the various components.

We presented the HAifa morphological DisAmbiguation System (HADAS). This system uses the output of a morphological analyzer, together with limited linguistic knowledge, to disambiguate morphologically annotated Hebrew text. HADAS consists of several (currently, 10) simple classifiers and a module which combines them.
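
The general idea of combining simple classifiers under the analyzer's constraints can be illustrated with a toy Python sketch. This is not the HADAS combination module: the attribute names, the confidence values and the additive scoring rule are all assumptions made for illustration. Each hypothetical classifier scores the values of one attribute, and only the analyses proposed by the morphological analyzer are considered, so inconsistent attribute combinations are never chosen.

```python
from typing import Dict, List

# One candidate analysis from the analyzer: attribute -> value (hypothetical names).
Analysis = Dict[str, str]
# One simple classifier's output for a word: value -> confidence (hypothetical numbers).
Prediction = Dict[str, float]


def score(analysis: Analysis, predictions: Dict[str, Prediction]) -> float:
    """Sum the confidence each attribute classifier assigns to this analysis's values."""
    return sum(
        predictions[attr].get(value, 0.0)
        for attr, value in analysis.items()
        if attr in predictions
    )


def disambiguate(candidates: List[Analysis],
                 predictions: Dict[str, Prediction]) -> Analysis:
    """Pick the candidate analysis that the classifiers jointly support the most."""
    if not candidates:
        raise ValueError("the analyzer returned no candidate analyses")
    return max(candidates, key=lambda a: score(a, predictions))


# Two candidate analyses of the same ambiguous word form (made-up example).
candidates = [
    {"pos": "noun", "gender": "masc", "number": "sing"},
    {"pos": "verb", "gender": "fem", "number": "sing"},
]
# Made-up outputs of three simple classifiers, one per attribute.
predictions = {
    "pos": {"noun": 0.7, "verb": 0.3},
    "gender": {"masc": 0.4, "fem": 0.6},
    "number": {"sing": 0.9, "plur": 0.1},
}

best = disambiguate(candidates, predictions)
```

In this example the gender classifier prefers the second analysis, but the combined evidence still favors the first one, which is the kind of trade-off a combination module has to resolve.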



Projects and Groups

  • Computational Linguistics Group: research in diverse areas of computational linguistics and natural language processing.
  • Juru: A full-text search library.
  • QSIA: A collaborative e-learning and knowledge sharing infrastructure.


Publications

Conference Papers

Gennadi Lembersky, Danny Shacham and Shuly Wintner. Morphological Disambiguation of Hebrew: A Case Study in Classifier Combination. Natural Language Engineering, accepted for publication.

Danny Shacham and Shuly Wintner. Morphological Disambiguation of Hebrew: A Case Study in Classifier Combination. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 439-447, Prague, June 2007. [PDF Download]

Idan Szpektor, Ido Dagan, Alon Lavie, Danny Shacham and Shuly Wintner. Cross Lingual and Semantic Retrieval for Cultural Heritage Appreciation. In Proceedings of the ACL-2007 Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007), pages 65-72, Prague, June 2007. [PDF Download]

Demo: Hebrew Morphological Disambiguation, IBM Haifa Research Labs, Information Retrieval Seminar, December 2005. [PDF]

Talk: Morphological Disambiguation of Hebrew Using a Combination of Simple Classifiers, ISCOL, Israeli Seminar on Computational Linguistics, June 2005. [abstract] [PDF]

Thesis

Danny Shacham. Morphological Disambiguation of Hebrew. University of Haifa MSc. Thesis, 2007 [PDF Download]