Prompt Engineering Lessons Learned Link to heading

After months of working with GitHub Copilot and building LLM-powered applications since getting my license in February, I’ve accumulated a collection of techniques and insights that have dramatically improved my results. Prompt engineering is emerging as a critical skill, one that combines elements of software engineering, psychology, and linguistics, and it has become essential for getting the most out of both coding assistants and API-based applications.

The Evolution of My Understanding Link to heading

When I first started using LLMs, I treated them like sophisticated search engines: ask a question, get an answer. This naive approach led to inconsistent results, hallucinations, and frustrating debugging sessions. Over time, I’ve learned that effective prompt engineering is about:

  • Clear communication: Being explicit about what you want
  • Context management: Providing the right information at the right time
  • Output formatting: Structuring responses for downstream use
  • Error handling: Planning for failure modes
  • Iterative refinement: Continuously improving based on results

Fundamental Principles Link to heading

Principle 1: Be Explicit and Specific Link to heading

Early attempts at prompts were often too vague:

Bad: "Write some code for a web app"

Good: "Write a Python Flask route that accepts POST requests
at /api/users, validates required fields (name, email),
stores the user in a PostgreSQL database, and returns
a JSON response with the created user ID or error details."

The specific version eliminates ambiguity and produces much more useful results.

Principle 2: Provide Examples (Few-Shot Learning) Link to heading

Examples are incredibly powerful for establishing patterns:

Prompt: "Convert these user queries to SQL WHERE clauses.

Examples:
Query: "Show me users who signed up last week"
SQL: WHERE created_at >= DATE_SUB(NOW(), INTERVAL 1 WEEK)

Query: "Find active premium subscribers"
SQL: WHERE status = 'active' AND plan_type = 'premium'

Query: "Users from Australia with more than 10 orders"
SQL: WHERE country = 'Australia' AND order_count > 10

Now convert this query:
Query: "Show me inactive users who haven't logged in for 30 days"
SQL:"

This pattern teaches the model the specific format and logic you want.
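
When these prompts are built in code, keeping the worked examples as data makes them easy to extend and reuse. A minimal sketch (the EXAMPLES list and build_few_shot_prompt helper are illustrative, not from any particular library):

EXAMPLES = [
    ("Show me users who signed up last week",
     "WHERE created_at >= DATE_SUB(NOW(), INTERVAL 1 WEEK)"),
    ("Find active premium subscribers",
     "WHERE status = 'active' AND plan_type = 'premium'"),
]

def build_few_shot_prompt(query):
    """Assemble the instruction, the worked examples, and the new query."""
    parts = ["Convert these user queries to SQL WHERE clauses.\n\nExamples:"]
    for q, sql in EXAMPLES:
        parts.append(f'Query: "{q}"\nSQL: {sql}')
    parts.append(f'Now convert this query:\nQuery: "{query}"\nSQL:')
    return "\n\n".join(parts)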

Principle 3: Use Structured Output Formats Link to heading

Requesting structured output makes responses much more useful:

Prompt: "Analyse this code for potential issues and format
your response as JSON:

{
  "issues": [
    {
      "severity": "high|medium|low",
      "type": "security|performance|maintainability|bug",
      "line": <line_number>,
      "description": "<issue_description>",
      "suggestion": "<suggested_fix>"
    }
  ],
  "overall_rating": "<1-10>",
  "summary": "<brief_overall_assessment>"
}

Code to analyse:
[code here]"

This ensures responses are machine-readable and consistently formatted.
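
The structured format only pays off if it is enforced downstream, so a parser that rejects malformed replies belongs next to the prompt. A minimal sketch, assuming the JSON shape requested above (parse_analysis is a hypothetical helper):

import json

def parse_analysis(response_text):
    """Parse the model's JSON reply and fail loudly if the agreed shape is missing."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model did not return valid JSON: {exc}")

    # Check for the fields the prompt explicitly asked for before trusting the result.
    for field in ("issues", "overall_rating", "summary"):
        if field not in data:
            raise ValueError(f"Missing expected field: {field}")
    return data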

Advanced Techniques Link to heading

Chain of Thought Prompting Link to heading

For complex reasoning tasks, asking the model to “think out loud” dramatically improves results:

Prompt: "Debug this JavaScript function that should calculate
compound interest but is returning incorrect results.

Please think through this step by step:
1. First, identify what the function should do mathematically
2. Then, trace through the code line by line
3. Identify any logical errors or bugs
4. Provide a corrected version with explanation

Function:
function compoundInterest(principal, rate, time, frequency) {
    return principal * Math.pow(1 + rate / frequency, frequency + time);
}"

The step-by-step approach helps the model avoid jumping to conclusions and produces more reliable analysis.
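
The same scaffolding can be generated programmatically so that any task prompt gets consistent step-by-step instructions. A small sketch (with_step_by_step is a hypothetical helper):

def with_step_by_step(task_prompt, steps):
    """Append explicit, numbered reasoning steps to a task prompt."""
    numbered = "\n".join(f"{i + 1}. {step}" for i, step in enumerate(steps))
    return f"{task_prompt}\n\nPlease think through this step by step:\n{numbered}"

debug_prompt = with_step_by_step(
    "Debug this JavaScript function that should calculate compound interest.",
    ["Identify what the function should do mathematically",
     "Trace through the code line by line",
     "Identify any logical errors or bugs",
     "Provide a corrected version with explanation"],
)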

Role-Based Prompting Link to heading

Establishing a clear role and context improves response quality:

Prompt: "You are a senior security engineer conducting a code review.
Your primary concerns are:
- Authentication and authorisation flaws
- Input validation vulnerabilities
- SQL injection risks
- XSS attack vectors
- Sensitive data exposure

Review this Express.js route and provide security feedback:

[code here]

Focus on actionable security improvements with specific examples."

Constitutional AI Patterns Link to heading

Building in self-correction and validation:

Prompt: "Generate a Python function to process user uploaded files.

After writing the function, review it for these security issues:
1. Path traversal attacks (e.g., ../../../etc/passwd)
2. File type validation bypass
3. Denial of service through large files
4. Executable file upload risks

If you find any issues, revise the function and explain the fixes."

This encourages the model to self-review and improve its initial response.
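
The same idea can be split into two calls, which keeps the critique step visible and loggable. A rough sketch, assuming llm is any callable that maps a prompt string to a response string:

def generate_with_self_review(llm, task_prompt, review_checklist):
    """Two-pass sketch: draft an answer, then ask the model to critique and revise it."""
    draft = llm(task_prompt)
    review_prompt = (
        f"Here is a draft solution:\n\n{draft}\n\n"
        "Review it for these issues:\n"
        + "\n".join(f"- {item}" for item in review_checklist)
        + "\n\nIf you find any problems, return a revised version and explain the fixes. "
        "Otherwise, return the draft unchanged."
    )
    return llm(review_prompt)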

Domain-Specific Patterns Link to heading

Code Review Prompts Link to heading

System Message: "You are an expert code reviewer. Provide
constructive, specific feedback focusing on:
- Code quality and best practices
- Potential bugs and edge cases
- Performance optimisations
- Security considerations
- Maintainability improvements

Always include specific examples and suggest concrete improvements."

User Message: "Please review this {language} code:

[code block with {language} syntax]
{code}
[end code block]

Consider the context: {context_description}"
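
In an application, these templates get filled in and sent as a system/user message pair. A sketch using the common chat-message dictionary format; the constant and helper names are illustrative:

SYSTEM_MESSAGE = (
    "You are an expert code reviewer. Provide constructive, specific feedback "
    "focusing on code quality, potential bugs, performance, security, and maintainability."
)

def build_review_messages(language, code, context_description):
    """Fill the templates above and return a chat-style message list (provider-agnostic)."""
    user_message = (
        f"Please review this {language} code:\n\n"
        f"```{language.lower()}\n{code}\n```\n\n"
        f"Consider the context: {context_description}"
    )
    return [
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": user_message},
    ]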

API Documentation Generation Link to heading

Prompt: "Generate OpenAPI 3.0 documentation for this API endpoint.

Endpoint details:
- Method: {method}
- Path: {path}
- Description: {description}
- Parameters: {parameters}
- Request body: {request_body}
- Response: {response}

Include:
1. Complete OpenAPI specification
2. Request/response examples
3. Error responses (400, 401, 404, 500)
4. Parameter validation rules

Format as valid YAML."
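
Because the prompt demands valid YAML, that claim is worth checking mechanically before the output is stored. A sketch assuming the PyYAML package is available and llm is a prompt-to-text callable; the abbreviated template and helper name are illustrative:

import yaml  # PyYAML, assumed to be installed

OPENAPI_PROMPT = """Generate OpenAPI 3.0 documentation for this API endpoint.

Endpoint details:
- Method: {method}
- Path: {path}
- Description: {description}

Include request/response examples and error responses (400, 401, 404, 500).
Format as valid YAML."""

def document_endpoint(llm, method, path, description):
    """Render the prompt, call the model, and reject replies that are not valid YAML."""
    prompt = OPENAPI_PROMPT.format(method=method, path=path, description=description)
    response = llm(prompt)
    yaml.safe_load(response)  # raises yaml.YAMLError if the reply is not parseable YAML
    return response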

Database Query Optimisation Link to heading

Prompt: "You are a database performance expert. Analyze this SQL query
for performance issues and suggest optimisations.

Query:
{sql_query}

Schema information:
{schema_details}

Consider:
1. Index usage and opportunities
2. Join optimisation
3. WHERE clause efficiency
4. Subquery vs JOIN trade-offs
5. Query execution plan analysis

Provide:
1. Performance analysis
2. Optimised query
3. Recommended indexes
4. Explanation of improvements"

Handling LLM Limitations Link to heading

Managing Hallucinations Link to heading

Prompt: "Based ONLY on the provided context, answer the question.
If the answer is not in the context, respond with
'I don't have enough information to answer that question.'

Context: {context}

Question: {question}

Important: Do not use external knowledge or make assumptions
beyond what's explicitly stated in the context."
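
Wrapping this pattern in a helper also gives the calling code a clean signal when the model declines to answer. A sketch (GROUNDED_TEMPLATE and ask_grounded are illustrative names):

GROUNDED_TEMPLATE = """Based ONLY on the provided context, answer the question.
If the answer is not in the context, respond with
'I don't have enough information to answer that question.'

Context: {context}

Question: {question}

Important: Do not use external knowledge or make assumptions
beyond what's explicitly stated in the context."""

def ask_grounded(llm, context, question):
    """Keep the model inside the supplied context and surface explicit refusals."""
    response = llm(GROUNDED_TEMPLATE.format(context=context, question=question))
    if "don't have enough information" in response:
        return None  # caller decides how to handle the knowledge gap
    return response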

Encouraging Uncertainty Link to heading

Prompt: "Analyze this code for potential bugs. For each issue you identify:

1. Rate your confidence (high/medium/low)
2. Explain why you think it might be a problem
3. Suggest how to verify if it's actually an issue

If you're uncertain about something, say so clearly.
It's better to express uncertainty than to give confident
but incorrect advice."

Version Control for Prompts Link to heading

I’ve developed a system for versioning and testing prompts:

from datetime import datetime

class PromptTemplate:
    def __init__(self, name, version, template, variables, test_cases=None):
        self.name = name
        self.version = version
        self.template = template
        self.variables = variables
        self.test_cases = test_cases or []
        self.created_at = datetime.now()

    def render(self, **kwargs):
        missing_vars = set(self.variables) - set(kwargs.keys())
        if missing_vars:
            raise ValueError(f"Missing variables: {missing_vars}")

        return self.template.format(**kwargs)

    def test(self, llm):
        """Run test cases against this prompt"""
        results = []
        for test_case in self.test_cases:
            prompt = self.render(**test_case['input'])
            response = llm(prompt)

            results.append({
                'input': test_case['input'],
                'expected': test_case.get('expected'),
                'actual': response,
                'passed': self._evaluate_response(response, test_case)
            })

        return results

    def _evaluate_response(self, response, test_case):
        """Simplified placeholder check: pass when every expected marker appears in the response."""
        return all(marker in response for marker in test_case.get('expected_issues', []))

# Example usage
code_review_prompt = PromptTemplate(
    name="code_review",
    version="1.2.0",
    template="""
    Review this {language} code for:
    - Security issues
    - Performance problems
    - Code quality concerns

    Code:
    {code}

    Provide specific, actionable feedback.
    """,
    variables=["language", "code"],
    test_cases=[
        {
            'input': {
                'language': 'Python',
                'code': 'password = input("Password: ")\nprint(f"Your password is: {password}")'
            },
            'expected_issues': ['password_exposure', 'logging_sensitive_data']
        }
    ]
)

Prompt Engineering for Different Models Link to heading

GPT-4 vs GPT-3.5 Turbo Link to heading

GPT-4 handles more complex instructions and nuanced requests:

# GPT-4 can handle this complex, multi-step prompt
"Analyse this codebase, create a refactoring plan, prioritise
the changes by impact and effort, and generate the first
three refactoring steps with complete implementations."

# GPT-3.5 Turbo works better with simpler, focused prompts
"Review this function for bugs and suggest one specific improvement."

Model-Specific Optimisations Link to heading

def get_prompt_for_model(base_prompt, model_name):
    """Adapt prompts for different models"""

    if model_name.startswith("gpt-4"):
        # GPT-4 can handle more complex instructions
        return f"""
        {base_prompt}

        Please provide detailed reasoning for your recommendations
        and consider multiple approaches before settling on your final answer.
        """

    elif model_name.startswith("gpt-3.5"):
        # GPT-3.5 benefits from more explicit structure
        return f"""
        {base_prompt}

        Format your response as:
        1. Summary
        2. Key points
        3. Recommendations
        """

    else:
        return base_prompt

Debugging Prompt Issues Link to heading

Common Problems and Solutions Link to heading

Problem: Inconsistent output format

# Bad
"Explain the pros and cons"

# Good
"List exactly 3 pros and 3 cons in this format:
Pros:
1. [pro 1]
2. [pro 2]
3. [pro 3]

Cons:
1. [con 1]
2. [con 2]
3. [con 3]"

Problem: Model ignoring instructions

# Solution: Repeat critical instructions
"Generate a Python function that validates email addresses.

IMPORTANT: Use only standard library modules - do not import
external libraries like regex or email-validator.

The function should return True for valid emails, False otherwise.

Remember: Use only Python standard library modules."

Problem: Hallucinated code libraries

# Solution: Be explicit about constraints
"Write a JavaScript function using only vanilla JavaScript.
Do not use any external libraries, frameworks, or modules
that require npm installation. Use only built-in JavaScript
features available in modern browsers."

Performance Optimisation Link to heading

Token Efficiency Link to heading

import re

def optimise_prompt_tokens(prompt):
    """Optimise prompt for token efficiency"""

    # Remove unnecessary whitespace
    optimised = re.sub(r'\s+', ' ', prompt.strip())

    # Use abbreviations for common terms
    replacements = {
        'please': 'pls',
        'function': 'fn',
        'variable': 'var',
        'parameter': 'param'
    }

    for full, abbrev in replacements.items():
        optimised = optimised.replace(full, abbrev)

    return optimised
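
Abbreviation tricks are only worth keeping if they measurably reduce the token count, so it helps to compare the before-and-after numbers. A small sketch using the tiktoken package (assumed to be installed); token_savings is an illustrative helper:

import tiktoken  # assumed available for token counting

def token_savings(original, optimised, model="gpt-3.5-turbo"):
    """Measure how many tokens an optimisation actually saves for a given model."""
    encoding = tiktoken.encoding_for_model(model)
    before = len(encoding.encode(original))
    after = len(encoding.encode(optimised))
    return before - after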

Batch Processing Link to heading

def batch_prompts(prompts, batch_size=10):
    """Combine multiple prompts into batches"""

    batched_prompt = "Process these requests:\n\n"

    for i, prompt in enumerate(prompts[:batch_size]):
        batched_prompt += f"Request {i+1}:\n{prompt}\n\n"

    batched_prompt += """
    Respond with:
    Response 1: [answer to request 1]
    Response 2: [answer to request 2]
    ...etc
    """

    return batched_prompt
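
Batching only helps if the combined reply can be split back apart reliably, so the prompt's "Response N:" convention needs a matching parser. A sketch (split_batched_response and its regex are illustrative):

import re

def split_batched_response(response, expected_count):
    """Split a 'Response N: ...' style reply back into individual answers."""
    pattern = re.compile(r"Response (\d+):\s*(.*?)(?=Response \d+:|$)", re.DOTALL)
    answers = {int(num): text.strip() for num, text in pattern.findall(response)}
    # Missing entries mean the model dropped a request; callers can retry just those.
    return [answers.get(i + 1) for i in range(expected_count)]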

Testing and Validation Link to heading

Automated Prompt Testing Link to heading

import json
from dataclasses import dataclass
from typing import List, Dict, Any

@dataclass
class PromptTest:
    name: str
    prompt: str
    expected_output: Dict[str, Any]
    validation_rules: List[str]

def validate_response(response: str, test: PromptTest) -> Dict[str, Any]:
    """Validate LLM response against test criteria"""

    results = {
        'passed': True,
        'issues': [],
        'score': 0
    }

    for rule in test.validation_rules:
        if rule == 'contains_code_block':
            if '```' not in response:
                results['issues'].append('Missing code block formatting')
                results['passed'] = False

        elif rule == 'valid_json':
            try:
                json.loads(response)
                results['score'] += 1
            except json.JSONDecodeError:
                results['issues'].append('Invalid JSON format')
                results['passed'] = False

        elif rule.startswith('contains:'):
            expected_text = rule.split(':', 1)[1]
            if expected_text not in response:
                results['issues'].append(f'Missing expected text: {expected_text}')
                results['passed'] = False

        elif rule == 'no_hallucination':
            # detect_hallucination is a project-specific check (defined elsewhere, not shown here)
            if detect_hallucination(response, test.prompt):
                results['issues'].append('Potential hallucination detected')
                results['passed'] = False

    return results

# Example test suite
test_suite = [
    PromptTest(
        name="code_generation",
        prompt="Generate a Python function that reverses a string",
        expected_output={'type': 'function', 'language': 'python'},
        validation_rules=['contains_code_block', 'contains:def ', 'contains:return']
    ),
    PromptTest(
        name="json_response",
        prompt="Analyze this code and return results as JSON",
        expected_output={'format': 'json'},
        validation_rules=['valid_json', 'contains:issues']
    )
]
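
A thin runner ties the test cases and validator together so prompts can be re-checked whenever they change. A sketch that reuses validate_response from above; llm is assumed to be a prompt-to-text callable:

def run_test_suite(llm, tests):
    """Run each PromptTest through the model and collect validation results."""
    report = []
    for test in tests:
        response = llm(test.prompt)
        result = validate_response(response, test)
        report.append({'name': test.name, **result})
        status = 'PASS' if result['passed'] else 'FAIL'
        print(f"{status}: {test.name} ({len(result['issues'])} issue(s) flagged)")
    return report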

Industry-Specific Applications Link to heading

System Message: "You are a legal document analyst. Focus on:
- Contract terms and obligations
- Potential risks or liabilities
- Missing standard clauses
- Ambiguous language that could cause disputes

Always indicate confidence level and recommend legal review for important decisions."

User Message: "Analyze this contract section for potential issues:
{contract_text}

Focus on: {specific_concerns}"

Financial Risk Assessment Link to heading

Prompt: "You are a financial risk analyst. Evaluate this investment proposal.

Proposal: {proposal_text}

Analyze:
1. Market risks (score 1-10)
2. Operational risks (score 1-10)
3. Financial risks (score 1-10)
4. Regulatory risks (score 1-10)

For each risk category:
- Identify specific risk factors
- Assess likelihood and impact
- Suggest mitigation strategies

Provide overall risk rating and investment recommendation."

Key Lessons Learned Link to heading

  1. Specificity beats generality: Detailed prompts produce better results than vague ones

  2. Examples are powerful: Few-shot learning dramatically improves output quality

  3. Structure is crucial: Requesting formatted output makes responses much more useful

  4. Iteration is essential: Good prompts are developed through testing and refinement

  5. Context matters: Providing relevant background information improves accuracy

  6. Validation is necessary: Always verify LLM outputs, especially for critical applications

  7. Model differences are significant: Tailor prompts to specific model capabilities

  8. Prompt engineering is programming: Treat prompts as code with versioning, testing, and documentation

Prompt engineering has become a core skill for working effectively with LLMs. Like any programming discipline, it benefits from systematic approaches, best practices, and continuous learning. The investment in developing these skills pays dividends in more reliable, useful, and cost-effective AI applications.


What prompt engineering techniques have you found most effective? How do you approach testing and iterating on prompts for your applications?