Natural language processing for a Japanese search engine

I want to build a search engine for Japanese text. Japanese is difficult because there are no spaces between words, and verbs are conjugated to show tense, negation, and politeness. It is also tricky because it uses multiple writing systems: two phonetic scripts (hiragana, for native Japanese words, and katakana, for foreign loanwords) and one logographic script (kanji, borrowed from Chinese).
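To make the problem concrete, here is a minimal Python sketch using only the standard library. It shows that whitespace splitting returns the whole sentence as one token, classifies characters by script using rough Unicode block ranges, and falls back to character bigrams as a crude stand-in for real word segmentation. The sample sentence, the function names `script_of` and `char_bigrams`, and the block ranges (which cover only the most common blocks) are my own assumptions for illustration, not an established recipe.

```python
import unicodedata

def script_of(ch):
    """Roughly classify one character by Unicode block (common blocks only)."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    return "other"

def char_bigrams(text):
    """Overlapping character bigrams within runs of Japanese characters.

    NFKC normalization first folds half-width katakana and full-width
    Latin letters into their canonical forms, so variants index alike.
    """
    norm = unicodedata.normalize("NFKC", text)
    bigrams, run = [], []
    for ch in norm + " ":  # trailing space flushes the last run
        if script_of(ch) != "other":
            run.append(ch)
            continue
        # A non-Japanese character (space, punctuation) ends the current run.
        bigrams.extend(a + b for a, b in zip(run, run[1:]))
        run = []
    return bigrams

# Hypothetical sample: "I did not drink coffee" (polite, negative, past).
sentence = "私はコーヒーを飲みませんでした"

print(sentence.split())                   # whitespace splitting finds one "word"
print([script_of(c) for c in sentence])   # all three scripts appear in one sentence
print(char_bigrams(sentence))             # candidate index terms without a dictionary
```

Bigrams sidestep segmentation but do nothing about conjugation: the negative polite past form 飲みませんでした would still not match the dictionary form 飲む, which is why I assume some kind of morphological analysis is needed as well.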

What preprocessing would you need to do, beyond what English requires, before you could build a tf-idf index?