o1 coding capabilities
I review the performance of OpenAI's new model and demonstrate how it helped me develop a web application with hardly any code changes on my part
OpenAI just announced their new model, o1.
It is an advancement in large language models (LLMs), designed specifically to enhance reasoning capabilities. It addresses the challenge of enabling AI systems to perform complex reasoning tasks more effectively.
Advanced Training Techniques: o1 uses new training methodologies that focus on reasoning.
Chain-of-Thought Reasoning: The model generates intermediate reasoning steps before arriving at a final answer.
Reinforcement Learning from Human Feedback (RLHF): The model is fine-tuned using feedback from human evaluators.
Dynamic Memory Mechanism: o1 incorporates a mechanism to remember and reference previous information within a conversation.
OpenAI have stated that it performs similarly to PhD students:
Comparison with GPT-4:
It is clearly better at tasks that require reasoning, but for tasks that need less reasoning, such as writing, GPT-4 performs similarly and responds more quickly.
I wanted to test this out on a platform I plan to use for a wider project, so I asked it to produce a solution that required multiple technologies communicating with each other: JavaScript for the front end, Python for the backend, and a library to display a Neo4j graph on a web page. This would have taken me a while to do myself... so let's see.
Since my web application will be quite specific, I wanted to be clear with my prompt. I asked GPT-4 to generate a prompt template:
I amended this and passed it to o1. It took about 15 seconds to come up with comprehensive documentation:
It was able to produce the code, project structure, and a Dockerised environment for this application. I liked how it connected the front end (React) and backend (Python / Flask). It selected react-force-graph-2d as the JS library to display the returned data. It even included rate limiting and input sanitisation. The only changes I had to make were to the Python library versions in requirements.txt and to the .env variables with my Neo4j credentials.
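To give a sense of the shape of the generated backend, here is a minimal sketch of a Flask endpoint that queries Neo4j and returns data in the `{nodes, links}` format that react-force-graph-2d expects. This is my own reconstruction under those assumptions, not o1's exact code; the route name, Cypher query, and environment variable names are illustrative.

```python
# Minimal sketch (not o1's exact output) of a Flask endpoint serving Neo4j
# data to the React front end. Assumes NEO4J_URI / NEO4J_USER / NEO4J_PASSWORD
# are provided via the .env file and loaded into the environment.
import os
from flask import Flask, jsonify
from neo4j import GraphDatabase

app = Flask(__name__)
driver = GraphDatabase.driver(
    os.environ["NEO4J_URI"],
    auth=(os.environ["NEO4J_USER"], os.environ["NEO4J_PASSWORD"]),
)

@app.route("/api/graph")
def get_graph():
    # Return nodes and relationships in the {nodes, links} shape
    # that react-force-graph-2d consumes on the React side.
    query = "MATCH (a)-[r]->(b) RETURN a, r, b LIMIT 100"  # illustrative query
    nodes, links, seen = [], [], set()
    with driver.session() as session:
        for record in session.run(query):
            for node in (record["a"], record["b"]):
                if node.element_id not in seen:
                    seen.add(node.element_id)
                    nodes.append({"id": node.element_id, "labels": list(node.labels)})
            links.append({
                "source": record["a"].element_id,
                "target": record["b"].element_id,
                "type": record["r"].type,
            })
    return jsonify({"nodes": nodes, "links": links})

if __name__ == "__main__":
    app.run(debug=True)
```

The JSON returned here can be passed straight to the ForceGraph2D component's graphData prop on the React side.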
I amended the graph parameters a bit to make it more readable.
We can see how it's "thought" about the response:
I wanted to try and create a problem where o1 would have to "think" a little more.
I wanted o1 to create a script to solve it and to give me the answer.
Here is the response:
I would have expected the Python script to be good, as it just implements a path-finding algorithm that should work for any input, but the path it proposed is wrong. It's unclear how it arrived at this answer. I tried the same problem in GPT-4o, which returned a different, but also wrong, result.
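For context on why a generic script should have worked: a breadth-first search path-finder is only a few lines and will find a shortest path for any state space you feed it. The sketch below is illustrative; the actual puzzle isn't reproduced here, so the `neighbours` function is a placeholder for the puzzle's moves.

```python
# Generic breadth-first search path-finder, as a sketch of what such a
# solver needs. `neighbours` is a placeholder: it should return the states
# reachable in one move from the given state.
from collections import deque

def shortest_path(start, goal, neighbours):
    """Return the shortest list of states from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        state = path[-1]
        if state == goal:
            return path
        for nxt in neighbours(state):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None
```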