G42 Launches NANDA 87B, an Open-Source Hindi-English Language Model with 87 Billion Parameters

G42, an Abu Dhabi-based AI group, has unveiled NANDA 87B, an open-source Hindi-English large language model with 87 billion parameters. The release is a significant upgrade over its predecessor, the original NANDA model, and is aimed in particular at India’s growing tech ecosystem.

NANDA 87B is not just another language model; it represents a concerted effort to bridge linguistic gaps and provide advanced AI capabilities tailored specifically for Hindi speakers. Developed through a collaboration between the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Inception (a subsidiary of G42), and Cerebras, this model is built on the robust Llama-3.1 70B architecture. The collaborative nature of this project underscores the importance of cross-institutional partnerships in advancing AI technology.

One of the standout features of NANDA 87B is its training methodology. The model was trained on over 65 billion Hindi tokens using a Hindi-centric tokenizer, which improves both training efficiency and inference accuracy. The tokenizer matters because a vocabulary built around Hindi can typically represent Devanagari text in fewer tokens than a general-purpose one, which helps the model understand and generate text that reads naturally to native speakers.
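To make the tokenizer point concrete, the sketch below compares "fertility" (average tokens per word) for two toy tokenizers on a short Hindi sentence. Neither is NANDA's actual tokenizer: `byte_tokenize` stands in for a worst-case byte-level fallback, and `vocab_tokenize` with the `HINDI_VOCAB` set is a hypothetical Hindi-aware vocabulary invented purely for illustration.

```python
def byte_tokenize(word: str) -> list[bytes]:
    """Toy fallback tokenizer: one token per UTF-8 byte."""
    return [bytes([b]) for b in word.encode("utf-8")]


def vocab_tokenize(word: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match tokenizer; unmatched characters become single tokens."""
    tokens, i = [], 0
    while i < len(word):
        # Try the longest candidate first, shrinking down to one character.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                tokens.append(word[i:j])
                i = j
                break
    return tokens


def fertility(text: str, tokenize) -> float:
    """Average number of tokens produced per whitespace-separated word."""
    words = text.split()
    return sum(len(tokenize(w)) for w in words) / len(words)


# Hypothetical vocabulary entries, chosen only for this example.
HINDI_VOCAB = {"हिंदी", "मेरी", "भाषा", "है"}
SAMPLE = "हिंदी मेरी भाषा है"

byte_fertility = fertility(SAMPLE, byte_tokenize)
vocab_fertility = fertility(SAMPLE, lambda w: vocab_tokenize(w, HINDI_VOCAB))

print(f"byte-level fertility:  {byte_fertility:.2f} tokens/word")
print(f"Hindi-vocab fertility: {vocab_fertility:.2f} tokens/word")
```

Lower fertility means shorter sequences for the same text, so more Hindi fits into a fixed context window and each training or inference step covers more content. Real tokenizers sit somewhere between these two extremes.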

The implications of NANDA 87B extend beyond its technical specifications. As Manu Jain, CEO of G42 India, said, “India deserves world-class technology that speaks its language.” That is the model’s stated purpose: to give Indian users AI tools that are both sophisticated and culturally relevant. By supporting formal Hindi, casual speech, and Hinglish (a blend of Hindi and English commonly used in everyday conversation), NANDA 87B is designed to serve a diverse audience and make advanced AI accessible to millions.

The model’s capabilities are extensive. NANDA 87B can perform a variety of tasks, including translation, summarization, instruction following, and transliteration. These functionalities are particularly valuable in educational settings, where students and educators can leverage the model for language learning and content creation. Additionally, businesses can utilize NANDA 87B to enhance customer interactions, streamline operations, and develop innovative solutions tailored to the Indian market.

Safety and cultural alignment have been prioritized in the design of NANDA 87B. According to G42, the model is built to produce responsible outputs that reflect the values and norms of Indian society. This focus on ethical AI matters more as concern about the misuse of technology grows. By embedding safety measures into the model, G42 aims to build trust among users and encourage widespread adoption.

The training of NANDA 87B was conducted on the Condor Galaxy, an advanced AI supercomputing system developed by G42 in partnership with Cerebras. This state-of-the-art infrastructure enables the processing of vast amounts of data, facilitating the complex computations required for training such a large model. The use of cutting-edge technology not only enhances the model’s performance but also positions G42 as a leader in the AI field, capable of pushing the boundaries of what is possible with language models.

Accessibility is another key aspect of NANDA 87B. The model is available as an open-weight resource on the MBZUAI Hugging Face page, allowing developers, creators, and businesses to use it and build on it. Releasing the weights openly broadens access to advanced AI technology, letting a wider range of users create applications on top of the model’s capabilities. By fostering an open ecosystem, G42 encourages collaboration and knowledge sharing, both of which help drive progress in the AI sector.
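For developers planning to download the weights, a rough sense of scale helps. The sketch below estimates the memory needed just to hold 87 billion parameters at several numeric precisions. The precisions listed are common choices and are assumptions here, not a statement of the format G42 actually publishes; the figures also exclude activations and the key-value cache used during inference.

```python
PARAMS = 87e9  # parameter count reported for NANDA 87B

# Bytes per parameter for common precisions (assumed, not confirmed for this release).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gb(PARAMS, dtype):.1f} GB")
```

At 16-bit precision the weights alone come to about 174 GB, which is more than a single commodity GPU holds, so running the model would generally call for multiple accelerators or a quantized variant.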

The launch of NANDA 87B comes at a time when the demand for AI-driven solutions is surging across various sectors in India. From education to entertainment and enterprise applications, the need for effective communication tools that can operate in multiple languages is more critical than ever. NANDA 87B is well-positioned to meet this demand, offering a versatile solution that can adapt to different contexts and user needs.

Moreover, the significance of this model extends beyond its immediate applications. NANDA 87B represents a broader trend in the AI industry towards creating models that are not only powerful but also inclusive. As the global tech landscape evolves, there is a growing recognition of the importance of linguistic diversity and cultural representation in AI development. By focusing on Hindi and catering to the unique linguistic characteristics of the Indian population, G42 is setting a precedent for future AI initiatives that prioritize inclusivity and accessibility.

The impact of NANDA 87B is likely to be felt across various domains. In education, for instance, the model can serve as a valuable resource for language learners, providing them with real-time feedback and assistance in their studies. Educators can leverage the model to create engaging content that resonates with students, enhancing the overall learning experience. In the business realm, companies can utilize NANDA 87B to improve customer service interactions, automate processes, and develop targeted marketing strategies that speak directly to their audience.

Furthermore, the entertainment industry stands to benefit significantly from the capabilities of NANDA 87B. Content creators can harness the model to generate scripts, dialogues, and narratives that reflect the nuances of Hindi and Hinglish, enriching the storytelling experience for audiences. As the demand for localized content continues to rise, NANDA 87B provides a powerful tool for creators looking to connect with their viewers on a deeper level.

As we look to the future, the launch of NANDA 87B signals a pivotal moment in the evolution of AI technology in India. It highlights the potential for AI to transcend language barriers and foster greater understanding among diverse populations. By equipping individuals and organizations with the tools they need to communicate effectively, G42 is contributing to a more interconnected and inclusive society.

In conclusion, the unveiling of NANDA 87B by G42 is a notable milestone for Hindi-language AI. With its range of features, commitment to cultural alignment, and openly available weights, NANDA 87B could change how Hindi speakers interact with technology. If the model gains traction across sectors, it is likely to play an important role in shaping the future of AI in India and beyond, and in moving the field toward a more inclusive and linguistically diverse AI landscape.