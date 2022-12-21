OpenAI, the non-profit artificial intelligence (AI) research company, has open-sourced Point-E - an AI tool that can generate 3D models from text prompts.

It means you can effectively generate 3D models in two minutes using text prompts like "a corgi wearing a red Santa hat" or "a multicoloured rainbow pumpkin".

While that sounds neat, there are some limitations to the technology. Let's take a deeper look at Point-E.

What is Point-E?

According to the research paper published along with the code base, Point-E is "A System for Generating 3D Point Clouds from Complex Prompts".

Let's break that down. The 'System' is the open-sourced tool. It takes 'Complex Prompts', which are natural-language text inputs such as "a vase of purple flowers", and uses them to generate '3D Point Clouds'.

A point cloud is a set of coordinates in a given space. Extrapolating this data can create basic objects that represent a 3D shape. Point clouds are not 3D models per se and do not capture texture or fine detail. Their advantage is that they require significantly less computational power. The name is a portmanteau of Point, from point clouds, and E, which stands for efficiency.

Okay. But if it isn't 3D, then how is it making a 3D object?

The team behind Point-E sidestepped the limitation with more AI. Alongside Point-E, it trained an additional AI model to convert the point clouds into 3D meshes.

Point-E itself uses two AI models: text-to-image and image-to-3D. The first reads the text prompt and generates a rendered image of an object that best fits the description. The second then takes over and generates a point cloud from that image, which can be converted into a 3D mesh.

The text-to-image model was trained on labelled pictures to help it understand the correlation between words and visuals. Similarly, the image-to-3D model was trained on images paired with 3D objects so that it learned to translate between the two.

Wait! Does that mean the tool threatens 3D artists?

Not yet. As mentioned before, the output is not perfect. The image-to-3D model can sometimes fail to understand the image given by the text-to-image model, resulting in a shape that does not match the prompt. It is not big on details either, and it scores worse in evaluations than more advanced techniques.

So why use Point-E?

Because it is far faster than comparable tools. Using Point-E, the researchers said they could produce 3D models in two minutes on a single Nvidia V100 GPU. The V100 launched in 2017, so generation should be faster on newer hardware.

The researchers also expect Point-E to suffer from some teething problems, such as biases inherited from image data sets and a lack of safeguards, which might result in the creation of "dangerous objects".

Okay. So, where can Point-E be used?

The researchers pitch the tool as a way to fabricate real-world objects, for example through 3D printing. 3D object modelling, rendering and printing can take anywhere from several days to months. If the kinks are ironed out, Point-E could significantly speed up the process. As it improves, the researchers say it could find its way into video game development and animation workflows. It could also be practical in fields where an accurate representation isn't required, like architectural previews or building and land proposals.
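
For readers who want to see what a point cloud looks like as data, here is a minimal Python sketch. It is not Point-E's code and does not call OpenAI's library. The function names, the sphere shape and the figure of 1,024 points are all assumptions chosen purely for illustration. The sketch only shows that a point cloud is a plain list of coordinates, with no surfaces connecting the points.

```python
import math
import random


def sample_sphere_point_cloud(num_points, radius=1.0, seed=0):
    """Return a list of (x, y, z) tuples lying on the surface of a sphere.

    Hypothetical helper for illustration; Point-E generates its points
    with a trained model, not with a formula like this.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(num_points):
        # Normalising a 3D Gaussian sample gives a uniformly random direction.
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        points.append((radius * x / norm, radius * y / norm, radius * z / norm))
    return points


def bounding_box(points):
    """Return the (min_corner, max_corner) of the axis-aligned bounding box."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# The point count is an arbitrary choice for this sketch.
cloud = sample_sphere_point_cloud(1024)
min_corner, max_corner = bounding_box(cloud)

print(len(cloud), "points, each a bare (x, y, z) coordinate")
print("first point:", cloud[0])
print("bounding box:", min_corner, max_corner)
```

Note that nothing in the list says which points are neighbours or where a surface lies. That missing information is what a mesh adds, and it is why a separate conversion step is needed before the result can be treated as a conventional 3D model.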

Moneycontrol News

