Tsuzumi, named after a Japanese hand drum used in traditional events, was launched last month for business use as the major telecom company seeks to catch up with foreign rivals in the fast-moving market.
In addition to being a multimodal AI model, tsuzumi has higher Japanese language processing capabilities than ChatGPT, a widely used AI chatbot developed by
With visual comprehension capabilities,
The functionality means it can also convert a document with many diagrams into text or calculate expenses based on taxi fare or meal receipts.
While AI platforms developed by overseas competitors do well at generating images or videos from text prompts and vice versa, parsing documents that contain diagrams and other media has been considered a challenge because of variations in file formats.
"If this technology becomes widely adopted by firms, productivity will improve by leaps and bounds," a developer at
Kyodo
© Kyodo News International, Inc.