Startup launches tool to help businesses find their ideal generative AI solutions

Artificial intelligence startup Arthur is helping businesses find the best generative text tools for their needs with a new open-source platform titled Arthur Bench.

The new service, released Thursday, August 17, evaluates output from various large language models (LLMs), algorithms used in tools such as OpenAI’s ChatGPT and Google’s Bard. With Arthur Bench, companies can compare results from different generative text platforms to see how each will behave given the same set of parameters, allowing them to find the perfect fit before incorporating AI into their businesses. The artificial intelligence startup simultaneously launched The Generative Assessment Project (GAP), a research effort aimed at identifying the pros and cons of competing LLMs. In its press release, Arthur reports that it has already discovered areas where Anthropic’s Claude-2 outperforms GPT-4. Through the GAP, the company can provide further updates and insights into industry-leading AI products.

With Arthur Bench and the GAP, the artificial intelligence startup has become one of the first organizations to compare AI models. LLMs often differ in terms of reliability given certain subject matters and can create output with varying degrees of quality or ease, making it difficult for companies to identify compatible platforms and obtain consistent results. Even worse, since most generative text algorithms are constantly analyzing new data, responses to user queries can change over time. Not only will companies now be able to determine which service aligns best with their operations, but they can also stay informed on updates and gain insight into the types of prompts and guidelines needed to get their desired content. Priyanka Oberoi, staff data scientist at Axios HQ, noted that Arthur Bench had “helped us develop an internal framework to scale and standardize LLM evaluation across features, and to describe performance to the Product team with meaningful and interpretable metrics.”

The ability to study generative text platforms before deciding on a service could be a game-changer for businesses and may lead to wider adoption of AI tools among hesitant companies. The artificial intelligence startup could also be paving the way for more oversight in an emerging sector. Standards, regulations and best practices have yet to be written when it comes to LLMs and other algorithms. A comparative platform like Arthur Bench could prove crucial to making the technology more accessible and reliable.


Colin Velez
Colin Velez is a staff writer/reporter for ASBN. After obtaining his bachelor’s in Communication from Kennesaw State University in 2018, he kicked off his writing career by developing marketing and public relations material for various industries, including travel and fashion. Throughout the next four years, he developed a love for working with journalists and other content creators, and his passion eventually led him to his current position. Today, Colin writes news content and coordinates stories with auto-industry insiders and entrepreneurs throughout the U.S.
