This is exactly the kind of software that people should be making. Even if this one is bad, the concept is very sound.
I want to have a specialist AI that is trained to help me learn how to use the software. 100% what should be happening.
General AI should know how to do stuff too, but having a specialist AI implies that the company/group of people/person is making sure to tune the model.
Is it? It's true there are a lot of very loud messages on that linked issue, but only MDN knows how many people are using and engaging with their system compared to the people on that thread. We, as the general public, don't know anything other than that there's a vocal group of dissidents, plus whatever MDN tells us; I looked and didn't find stats on its usage.
ChatGPT is not up to date unless you start using the plugins. This sort of indexing is based on vector databases and various intermediate prompting. If you want to get technical, the academic term is "Retrieval Augmented Generation".
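Mechanically the loop is small: embed your documentation chunks once, embed the incoming question, pull the nearest chunks, and prepend them to the prompt. A rough sketch, assuming the pre-1.0 `openai` Python SDK and a toy in-memory index (the chunk strings are placeholders; a real system would use a vector database):

```python
import numpy as np
import openai  # pre-1.0 openai SDK assumed

EMBED_MODEL = "text-embedding-ada-002"

def embed(texts):
    # One embedding vector per input string
    resp = openai.Embedding.create(model=EMBED_MODEL, input=texts)
    return np.array([d["embedding"] for d in resp["data"]])

# Placeholder documentation chunks; embed them once, up front
chunks = ["Lambda functions can attach to a VPC ...", "RDS Proxy pools database connections ..."]
index = embed(chunks)

def answer(question, k=3):
    q = embed([question])[0]
    # Cosine similarity against every chunk, keep the top-k as context
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    context = "\n---\n".join(chunks[i] for i in np.argsort(sims)[-k:][::-1])
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer ONLY from the context. If it's not there, say so."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp["choices"][0]["message"]["content"]
```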
Hallucination is unfortunately inevitable when it comes to any autoregressive model, even with RAG. You can minimize hallucination by prompting, but you'll still see some factually incorrect responses here and there (https://zilliz.com/blog/ChatGPT-VectorDB-Prompt-as-code).
I unfortunately don't think we'll be able to solve hallucination anytime soon. Maybe with the successor to the transformer architecture?
Hallucination is naturally a concern for anyone looking to depend upon LLM-generated answers.
We’ve been testing LLM responses with a CLI that generates accuracy statistics, which is especially useful when the use-case Q/A set is limited.
If ‘confidence’ can be returned to the user, then at least they have an indication of whether a given response carries a higher quality risk.
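To make that concrete, this kind of harness can be tiny. A rough sketch only; the cases and the answer callable are made-up placeholders, and real grading needs fuzzier matching than keyword checks:

```python
# Hypothetical harness: CASES and the answer callable are made-up examples.
CASES = [
    ("What is Amazon CodeCatalyst?", ["codecatalyst"]),
    ("Default MySQL port on RDS?", ["3306"]),
]

def accuracy(get_answer):
    hits = 0
    for question, required in CASES:
        text = get_answer(question).lower()
        # Crude keyword check; real grading usually needs fuzzier matching
        if all(term in text for term in required):
            hits += 1
    return hits / len(CASES)

# Usage: print(f"{accuracy(my_bot_answer):.0%}")  # my_bot_answer is your own callable
```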
Here's one example question that ChatGPT utterly fails at, but that this answers fine: "What is Amazon CodeCatalyst?"
ChatGPT: "I'm sorry, but as of my knowledge cut-off in September 2021, there was no service, tool, or product known as Amazon CodeCatalyst offered by Amazon Web Services (AWS). [...]"
Prompt: how to connect a lambda function to an RDS instance and query it safely?
1. Configure your RDS instance for outbound connections to AWS Lambda. This step involves ensuring that your DB instance is in a public subnet on your VPC and that the instance's "PubliclyAccessible" property is set to true.
Lol yeah no thanks. This is one of the things that makes me nervous about using LLMs. If the internet contains shitty solutions for solving a problem, it’s going to recommend shitty solutions. Inexperienced folks won’t know better.
There’s also the thing where that’s a positive feedback loop and we overflow the world’s corpus with garbage.
I've seen answers like this in Stack Overflow before. Technically correct, but basically broken in every other way: cybersecurity's future is gonna be interesting.
For that matter, using an access-key or secret-key at all is basically an anti-pattern now.
In the past few months AWS has added a popup box to their UI that asks why you are creating access keys, and it makes you answer before generating them. Once you give it the answer, it tells you better ways to do almost anything you're trying to do, and you have to agree to ignore those suggestions one more time before actually generating the keys.
Basically, standard access keys are really of no use anymore. You should be using roles, OIDC, IAM Identity Center, Cognito federated identities, or something else. There are a million ways to identify yourself that are more secure than that, and I wish tutorials would stop assuming you should generate access keys, because in no practical 2023 use case should you be doing that anymore. Especially not in any sort of sensitive or corporate environment.
If you’re part of any decently large organization, you’re probably using Control Tower or another solution that lets you create temporary credentials easily.
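For concreteness, "temporary credentials" usually means something like an STS assume-role call instead of a long-lived key pair. A rough sketch with boto3 (the role ARN is made up):

```python
import boto3

sts = boto3.client("sts")
# Hypothetical role ARN; in practice this comes from your org's setup
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/example-deploy-role",
    RoleSessionName="short-lived-session",
)["Credentials"]

# These keys expire automatically (typically within an hour), unlike static access keys
session = boto3.Session(
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(session.client("sts").get_caller_identity()["Arn"])
```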
> If the internet contains shitty solutions for solving a problem, it’s going to recommend shitty solutions.
Have you ever noticed how the dumbest people tend to be the loudest, most confident, and most eager to share their opinions? Those are the voices that trained ChatGPT.
It’s a hard pass for me, but I fear I’m going to be forced to use it because the dumbest half of humanity will view it as a peer and use it for everything.
To connect a Lambda function to an RDS (Relational Database Service) instance and query it safely, you can follow these general steps:
1. Configure your Lambda function's execution role: Ensure that your Lambda function has the necessary permissions to access the RDS instance. The role should include the `AWSLambdaVPCAccessExecutionRole` and the appropriate RDS permissions.
2. Place your Lambda function and RDS instance in the same VPC: Ensure that both the Lambda function and the RDS instance are in the same Virtual Private Cloud (VPC) to enable their communication. You can configure the VPC settings in the AWS Management Console.
3. Configure the security group settings: Set up the security groups for your Lambda function and RDS instance to allow the necessary inbound and outbound connections. The Lambda function's security group should allow outbound access to the RDS instance's security group on the RDS port (usually port 3306 for MySQL or 5432 for PostgreSQL).
4. Create a subnet group: If your RDS instance is in a private subnet, create a subnet group that includes the appropriate subnets for your Lambda function to access the RDS instance securely.
5. Connect to the RDS instance from the Lambda function: In your Lambda function code, establish a database connection using the appropriate database client library, such as pymysql for MySQL or psycopg2 for PostgreSQL. Use the connection details (endpoint, username, password, etc.) of your RDS instance.
6. Execute queries safely: To query the RDS instance safely, ensure that your Lambda function's code incorporates secure coding practices. Use parameterized queries or prepared statements to prevent SQL injection attacks. Avoid storing sensitive information (such as database credentials) in your Lambda function code and instead use environment variables or AWS Secrets Manager for secure storage and retrieval.
Remember to regularly update and patch your Lambda function, RDS instance, and associated dependencies to maintain security and stay protected against potential vulnerabilities.
It's worth noting that the exact implementation details may vary depending on your specific use case, programming language, and AWS setup. Consulting the AWS documentation and resources related to Lambda, RDS, and VPC networking can provide more detailed and up-to-date instructions for your specific requirements.
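For what it's worth, steps 5 and 6 above boil down to something like the sketch below, assuming pymysql, a VPC-attached function, and credentials injected via environment variables (Secrets Manager or RDS Proxy would be better in practice):

```python
import os
import pymysql

# Connection created outside the handler so warm invocations reuse it
conn = pymysql.connect(
    host=os.environ["DB_HOST"],          # RDS (or RDS Proxy) endpoint
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],  # better: fetch from Secrets Manager
    database=os.environ["DB_NAME"],
    connect_timeout=5,
)

def handler(event, context):
    user_id = event["user_id"]
    with conn.cursor() as cur:
        # Parameterized query: never interpolate event data into the SQL string
        cur.execute("SELECT name, email FROM users WHERE id = %s", (user_id,))
        row = cur.fetchone()
    return {"user": row}
```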
This to me is why I don't use ChatGPT. Every time I use it I get answers that are questionable in validity. I'd say that 3/4 of my questions have returned incorrect answers at least partially. Maybe 10% are wildly incorrect with the remainder being partially incorrect.
So if it is wrong with MOST of the questions that I am able to validate myself, then how can I trust it on the questions that I am unable to validate myself.
The whole reason I started even doing this passive research is because late last year I had an employee who used to be a decent employee (he was never an all-star, but he got the job done to an average/satisfactory level) and who all of a sudden started to perform incredibly poorly.
He was submitting code and solutions that were just really bad. He had always been an average producer, teetering on the edge during performance reviews but doing just well enough that we kept him around. Then he quickly started making mistake after mistake. In several code reviews I found really strange artifacts and comments in his code that were blatant mistakes. I confronted him about them over the course of several performance reviews, and he blamed Stack Overflow "copy pasta". Eventually it got to the point that almost everything he submitted or produced was problematic in some way and he was burning more of my time than he was saving, so I ended up firing him.
While we were letting him go, in front of HR, he finally broke down and admitted that he had been using ChatGPT for everything, and he begged us to let him stay, promising he would stop using it altogether. I of course didn't change course at that point and we let him go. But I started to realize the increase in mistakes was all due to ChatGPT leading him astray.
That whole experiment really taught me that ChatGPT is not ready for primetime. If you blindly trust ChatGPT you will find yourself in the wrong place most of the time. The problem is that unless you already know the answer to the question you are asking, it is very difficult to tell where ChatGPT's answers might be correct and where they might be incorrect (because it is usually a mixture of both). This makes it entirely useless for questions that you are not comfortable validating.
Presumably you're a good manager, but the way you told it makes you sound like a bad manager who didn't dig into what was going on with a poorly performing employee and let them flail around until you had to fire them. There's more to that story and it's not really about ChatGPT at all.
There are other stories out there, like https://hyperbo.la/w/chatgpt-4000/ which shows it can be useful and a force multiplier when used well, but it's like giving a faster car to a bad driver. It'll just result in them crashing faster. If you've got a programmer that doesn't want to program, ChatGPT can't help them be a better programmer since they don't actually want to be one!
Why would he be so reluctant to confess to using ChatGPT? And why would he continue using it despite constant feedback that his performance was getting so much worse?
> So if it is wrong with MOST of the questions that I am able to validate myself, then how can I trust it on the questions that I am unable to validate myself.
If you’re using it to generate code, you can validate it yourself - run the code.
ChatGPT doesn't need to provide perfect AWS solutions for people without AWS experience; the onus is on the human to do the work necessary on their side to be sure of the solution it produces.
It should be thought of as a time saving tool for experts (and people willing to put the work in), not a magic button for lazy people.
The role doesn’t need any IAM RDS permissions. It needs network access. But then all of the access to the database is controlled by your standard database permissions - not IAM.
I guess there are corner cases, like using IAM authentication to connect to your database, which can be done but is not typical.
It still isn't though. This will work for testing, maybe, but each Lambda will hold its own connection and that will exhaust DB resources real fast; you're supposed to have a pooling proxy (e.g. RDS Proxy) between Lambdas and RDS.
I asked ChatGPT the same question again and then followed up with:
“Is there a more efficient way to connect to the database”
It then went off the rails. It gave me generic answers about using connection pooling that your language framework provides.
Of course that doesn’t help with each Lambda being invoked separately.
I then asked “But each Lambda runs based on its own vm”
It then correctly said that while each Lambda invocation would have its own pool and couldn’t be shared, it could be shared with multiple connections during one invocation.
Which is technically correct. But not the answer I was looking for.
In all fairness, it’s the same type of answers I would expect from a junior to mid developer with limited experience with AWS. I would hire a person who could answer that well.
I try not to cheat and hint at specific issues (since that relies on prior knowledge)
I'd be surprised if even after the last prompt it wouldn't notice that. Saying "Did we miss anything?" leaves it open to re-evaluate both the implementation and the original considerations.
RAG is very difficult to do right. I am experimenting with various RAG projects from [1]. The main problems are:
- Chunking can interfere with context boundaries
- Content vectors can differ vastly from question vectors; to bridge this you have to use hypothetical embeddings (generate artificial questions and store their embeddings)
- Instead of saving just one embedding per text chunk, you should store several (text chunk, hypothetical question embeddings, metadata); see the sketch after this list
- RAG will miserably fail with requests like "summarize the whole document"
- To my knowledge, OpenAI embeddings don't perform especially well; use an embedding model that is optimized for question answering or information retrieval and that supports multiple languages. SOTA text embedding models can be found on the MTEB Leaderboard [2]. Also look into InstructorEmbeddings
- The LLM used for Q&A over your context should be fine-tuned for that task. There are several open(-source?) LLMs based on OpenLLaMA and others that are fine-tuned for information retrieval; they hallucinate less and stick to the given context.
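To make the hypothetical-embeddings point concrete, ingestion then stores several vectors per chunk, all pointing back at the same text. A rough sketch, assuming the pre-1.0 `openai` Python SDK; the `store` list is just a stand-in for a real vector database:

```python
import openai  # pre-1.0 openai SDK assumed

def hypothetical_questions(chunk, n=3):
    # Ask the LLM which questions this chunk could answer
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": f"Write {n} short questions this text answers, one per line:\n\n{chunk}",
        }],
    )
    lines = resp["choices"][0]["message"]["content"].splitlines()
    return [q.strip() for q in lines if q.strip()]

def ingest(chunk, doc_id, store):
    # One vector for the chunk itself plus one per generated question,
    # all carrying metadata that points back at the same chunk text.
    texts = [chunk] + hypothetical_questions(chunk)
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    for text, item in zip(texts, resp["data"]):
        store.append({
            "vector": item["embedding"],
            "chunk": chunk,
            "meta": {"doc_id": doc_id, "indexed_as": text},
        })
```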
I’ve been working with RAG for months, too, and it’s vanishingly rare to see anything but toy examples in the wild. This is a solid, concise list of where the dragons are.
Any idea where all the RAG practitioners hang out and trade war stories? Is there a forum or Discord or something?
Marqo provides automatic, configurable chunking (for example with overlap) and can allow you to bring your own model or choose from a wide range of opensource models. I think e5-large would be a good one to try. https://github.com/marqo-ai/marqo
Using Qdrant doesn’t require Docker, unlike Marqo (going by the README). Any trade-offs between the two? Doc chunking is an independent piece of functionality and there are already libraries that help chunk with overlap etc., and it's also not hard to roll your own.
To solve the question-embedding issue I prefer another approach: you store document embeddings as normal, then for searching you let the LLM hallucinate an answer and use that wrong-ish answer's embedding to do the vector search.
The inverse idea of Hypothetical Embeddings, HyDE [1]
"HyDE is an embedding technique that takes queries, generates a hypothetical answer, and then embeds that generated document and uses that as the final example."
BriefGPT [2] is implementing this and it uses the following prompt at ingestion-time:
"Given the user's question, please generate a response that mimics the exact format in which the relevant information would appear within a document, even if the information does not exist.
The response should not offer explanations, context, or commentary, but should emulate the precise structure in which the answer would be found in a hypothetical document.
Factuality is not important, the priority is the hypothetical structure of the excerpt. Use made-up facts to emulate the structure.
For example, if the user question is "who are the authors?", the response should be something like
'Authors: John Smith, Jane Doe, and Bob Jones'
The user's question is:"
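At query time the flow HyDE describes is just: hallucinate, embed the hallucination, search. A rough sketch, assuming the pre-1.0 `openai` Python SDK; `search` here stands in for whatever vector store lookup you use:

```python
import openai  # pre-1.0 openai SDK assumed

HYDE_PROMPT = (
    "Generate a short passage that looks like it was copied from a document "
    "answering the question. Made-up facts are fine; only the shape matters.\n\nQuestion: {q}"
)

def hyde_search(question, search, k=5):
    # 1. Let the model hallucinate a plausible-looking answer passage
    fake = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": HYDE_PROMPT.format(q=question)}],
    )["choices"][0]["message"]["content"]
    # 2. Embed the hallucination instead of the raw question
    vec = openai.Embedding.create(
        model="text-embedding-ada-002", input=[fake]
    )["data"][0]["embedding"]
    # 3. Use that vector for the nearest-neighbour lookup (search is user-supplied)
    return search(vec, k=k)
```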
That’s an interesting list (awesome-marketing-data science). Curious what is specific to marketing in that list, or maybe it’s just an inaccurate name.
Historical repo name; it's really not that anymore, besides a very old list of marketing stuff that I rarely update. I should rename the repo, but I hesitate :)
Most likely, the model would be less inclined to answer questions/hallucinate for prompts not related to AWS; this is definitely a future path for improvement
They're all so eager to please that they will basically never say "that's actually not possible", and will instead invent some plausible-sounding bullshit.
For beginners who are struggling to tell the difference between 1/ how to do something that's possible, 2/ how to do something that SHOULD be possible but just currently isn't, and 3/ how to do something that is RIDICULOUS and shouldn't even ever be possible, ChatGPT is worse than nothing :(
I launched “Last Week in AWS” with AWS in the domain name seven years ago. AWS has never made an issue of it, though they obviously have that option.
I also have the option (and ownership) to migrate to “Last Week in the Cloud” and talk about their competitors, so it’s likely everyone is happier this way—but I confess to not kicking the bear hard enough to find out.
The thought that a bunch of people will trust this tool and make some terrible decisions is unsettling. On the other hand it could be pretty powerful if you know what you are doing.
I am building something similar. It has documentation from Azure and AWS, and lots of Slack/Discord threads from software projects are also searchable. Check it out: https://www.kwq.ai
It even gets real-time indexing from the Slack of the AWS Deep Java Library and from the Discord of the deepset Haystack project.
To be fair to AWS, their documentation has improved dramatically the past 24 months or so. Not perfect, but the difference between 2019 and 2023 is dramatic.
Also, I'd argue that it is very easy to be worse than AWS's own docs. A chatbot that hallucinates inaccurate answers, while sounding plausible and confident is far worse than documentation that is 100% accurate, but lacking in some areas.
I'd rather receive an answer of "I don't know" or "it isn't documented" than an explanation that seems legit, only to spend 15 hours building a solution in accordance with an LLM's response and then find out that it's not possible or real the way the LLM explained it.
What's the best current way to make a bunch of documents searchable via LLMs like this?
I've tried using OpenAI with embeddings (IIRC), but this was slow, got expensive quickly, and it struggled to answer questions about the text accurately. Curious if there are better standard approaches now.
A couple of things come to mind:
1. Embedding methods: there are a couple of ways to do this; the most used one is OpenAI's text-embedding-ada-002, although in my use case (short sentence descriptions of an API) it didn't work very well;
2. How you split documents into pieces: langchain has some implementations and helpful pointers for this (a rough sketch follows below).
I think you have to do lots of experiments on this until you find your best information retrieval strategy.
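On the chunking point, a plain sliding window with overlap is often enough to start with; a rough sketch (langchain's splitters do roughly this plus smarter boundary handling):

```python
def chunk_text(text, size=1000, overlap=200):
    # Fixed-size windows that overlap so context isn't cut at hard boundaries
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# Example usage with a placeholder file name
chunks = chunk_text(open("aws_docs_page.txt").read())
```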
It made this up as the reason for a dedicated VPC:
“Better performance: By using dedicated hardware, you have better control over the performance of your instances in the VPC. This can be beneficial for applications that have stringent performance requirements.”
What if you could train an LLM on the body of source code for AWS (or whatever you choose as your SaaS), so it would actually know deeply, exactly, intimately how AWS works, and how to do anything and everything in it? Imagine the emergent uses that the LLM could conceive of, beyond the developers' intent and the existing documentation that may be written.
With the bonus effect that you could retrain the LLM as often as you like, as new software is released, and it could always be aware of the latest features (and even the bugs and vulnerabilities.)
Using ChatGPT for AWS service questions is actually pretty good. For instance, I asked it for a Cloud Practitioner study guide (using a small set of crafted prompts), and GPT performed quite well. While I have yet to query GPT about Solutions Architect or DevOps material, I know I can feed a set of URLs and GPT will "learn" the material and summarize it in ways meaningful and relevant to my prompts. In this way, ChatGPT is quite a powerful assistant on its own.
Agree, I've had great results asking chatgpt questions about AWS services. The interactivity is very helpful, and chatgpt will draft scripts too, although that's hit or miss. But for understanding concepts and services, it's great.
Meh. It doesn't actually pull a valid response. We just upgraded a database that required an updated EC2 instance, and it failed to connect those dots.
Let's see what happens when you search for the following "documentation" ;) :
"""
Complete the following code:
// Find all files with a binary pattern in a directory. Return the list of files and the offset of the match in a tuple
fn find_in_files(start_dir: &str, pattern: &[u8]) -> Vec<(String, usize)>
"""
Hey everyone, I am the creator of AWS Docs GPT and it's been extremely useful to gather all of your feedback on the site; thank you guys so much! We are constantly improving and updating the GPT, including fewer hallucinations, more accurate responses, chat context, and much more. At the end of the day, I really hope this tool can be useful for developers like myself out there!!!
Unfortunately this is a good concept, but it's using gpt-3.5-turbo. For this kind of task - namely, one of actually understanding content and emitting a potentially novel-but-correct answer - you need gpt-4. But it's quite slow and you'll quickly run into rate limiting.
I ran into these issues when building this for my own company's docs, at least.
>This action allows you to filter your things based on the attribute name and value. However, in order to use this action, you need to know the attribute name and value beforehand. Since you don't know the attributes ahead of time, this method won't work for your case.
Well… Thanks a ton for that!
Great idea and setup, but not quite as helpful as I would like yet.
I've been using GPT4 for this since the beginning - ironically, a large majority of AWS documentation has been machine generated since 2018. Circa 2019 the entire Elixir API for AWS was machine generated.
Asking GPT4 is also consistently less of a headache than asking the devops guy and getting a 20min explanation for a simple question.
Q: can you summarize amazon bedrock in 2 sentences
A: Amazon Bedrock is a programming example that uses the AWS SDK for Java 2.x to work with Amazon EC2. It provides examples and code snippets for programming tasks related to Amazon EC2 using the AWS SDK for Java 2.x.
The best part is the documentation sources; in fact I'd prefer the only feature to be "where are the docs on <stupid specific thing I need>" instead of it trying to synthesize an answer for me
It seems to know more than just AWS. I was thinking it was just embeddings, but then I asked it "Is AWS better than Azure?" and it gave an answer which didn't seem to be derived from the source documents.
I really love this concept. While I do get better results from GPT-4 for AWS questions right now, AI as the "interpreter" for documentation works really well.
No, AWS is not designed to take all your money. AWS offers a variety of payment options and cost-saving measures to help you manage your expenses effectively. ...