[Run] ChatGPT Plugin for PowerToys Run #25411

Closed
Simizfo wants to merge 5 commits

@Simizfo Simizfo commented Apr 11, 2023

image

Old images (for history/collection sake)

image

Summary of the Pull Request

Hi there again! This is Simone. A few weeks ago I tweeted about a small prototype plugin, written in just 2 hours, that integrates ChatGPT into PowerToys Run.

Clint sent me a PM later that day and we had an amazing talk about making this real and public as a Community Plugin, so here I am!

PR Checklist

Detailed Description of the Pull Request / Additional comments

Here's how the Plugin currently works and what still needs to be decided

✅ Working

  • The plugin activates with the keyword ?? (this currently replaces web search and needs to change; any suggestions?)
    • Plugin is OFF by default
  • You can write your question and terminate it with "?"
  • Once done, the OpenAI API gets queried and you get a response
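As an illustration of the terminator rule in the list above, here is a minimal C# sketch. `QueryGate` and `TryGetCompletedQuestion` are hypothetical names, not code from this PR; the sketch only assumes that the plugin receives the raw text typed after the activation keyword.

```csharp
#nullable enable
using System;

public static class QueryGate
{
    // Hypothetical helper: returns the question to send,
    // or null while the user is still typing.
    public static string? TryGetCompletedQuestion(string rawQuery)
    {
        if (string.IsNullOrWhiteSpace(rawQuery))
        {
            return null;
        }

        string trimmed = rawQuery.Trim();

        // A lone "?" is not a question, so require at least one other character.
        return trimmed.Length > 1 && trimmed.EndsWith('?') ? trimmed : null;
    }
}
```

Only a non-null result would trigger an API call, so intermediate keystrokes would not cost anything.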

⚒️ What needs to be done

  • Change the activation keyword
  • The user needs a way to input their own OpenAI API key. String settings aren't currently supported for Run Plugins, so this needs to be added separately ([Run] String-type settings support for PowerToys Run plugins #25326)
  • A new UI is needed to display answers better
  • Define a "click" action. Copy the full answer to the clipboard?
  • Replace the icons
  • Error handling
  • Localization
  • Explore new ideas on how this can be made better

Validation Steps Performed

None yet

Suggestions 💡

I'm totally open to suggestions; that is another reason why I'm opening this PR as a Draft, besides the Plugin not being ready.

@Simizfo
Author

Simizfo commented Apr 11, 2023

@microsoft-github-policy-service agree

@Simizfo Simizfo changed the title from "Pt run chatgpt plugin" to "[Run] ChatGPT Plugin for PowerToys Run" Apr 11, 2023

@Simizfo Simizfo mentioned this pull request Apr 11, 2023
@htcfreek
Collaborator

htcfreek commented Apr 11, 2023

Two thoughts on this:

  1. The plugin should be off by default. (You can use the OneNote plugin as a reference.)
  2. We need logging of communication errors and detailed user messages, as in the Calculator plugin. If possible, we should show custom error texts to the user (e.g. "API key refused!").
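One possible shape for the custom error texts suggested above is a small mapping from HTTP status to a user-facing message. This is a sketch, not the plugin's code: `ApiErrorText` is a hypothetical helper, and the assumption that a rejected key surfaces as 401 and rate limiting as 429 should be checked against the OpenAI API documentation.

```csharp
using System.Net;

public static class ApiErrorText
{
    // Hypothetical mapping from an HTTP status to the text shown in the result list.
    public static string ForStatus(HttpStatusCode status) => status switch
    {
        // Assumption: an invalid or revoked key is reported as 401.
        HttpStatusCode.Unauthorized => "API key refused!",

        // Assumption: rate limiting is reported as 429.
        HttpStatusCode.TooManyRequests => "Rate limit reached. Try again later.",

        // Fallback keeps the numeric code so it can also be written to the log.
        _ => $"Request failed ({(int)status} {status}).",
    };
}
```

The same status value could be logged alongside the message, which would cover both halves of the suggestion.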

@Simizfo
Author

Simizfo commented Apr 11, 2023

Two thoughts on this:

Thank you!

  1. The plugin should be off by default. (You can look at the Onenote plugin as help reference.)

This is already in, I forgot to mention it, I'll edit the post now

  2. We need logging of communication errors and detailed user messages like in calculator plugin. If possible we should show custom error texts to the user (e.g. "API key refused!").

Definitely needed!

Thanks again 👍

@mdrejhon

mdrejhon commented Apr 12, 2023

EDIT -- shortened this comment and moved most of comment to #25436 (comment)

Details

Early Bird Version 2 Suggestion

Integrate ChatGPT feature as plugin into proposal for customizable optional PrintScreen context menu

image

A merger of Snipping Tool and Text Extractor with more features, including the ability to ask an AI about what's on the computer screen, with queries such as:

  • "...Where did this image originally come from?..."
  • "...How do I make this text bigger in the menus of this specific app?..."
  • "...What's the best way for me to test this shader example...?"
  • "...I don't understand this strange command line error, do you know why this is happening?..."
  • "...This popup error is new. Is there a security issue?..."
  • "...Explain this command line compiler error..."

[snip - bigger comment moved to #25436]

#25197 - PrintScreen Context Menu Idea

@htcfreek
Collaborator

@Simizfo , @mdrejhon
Can we please open a new issue for the v2 brainstorm discussion? Then the PR discussion stays clean and focused on the v1 implementation.

@htcfreek
Collaborator

htcfreek commented Apr 12, 2023

@Simizfo
For such a plugin we should add support for disabling Plugins using GPOs. (Maybe something generic in Wox's code based on the plugin GUIDs.)

@jaimecbernardo I suggest to discuss implementation via email and to create a new issue.

@Simizfo
Author

Simizfo commented Apr 12, 2023

@Simizfo , @mdrejhon Can we please open a new issue for the brainstorm v2 discussion. Then the PR discussion keeps clean and focused on v1 implementation.

Alright, will do! (Edit: done! #25436)

@Simizfo For such a plugin we should add support for disabling Plugins using GPOs. (Maybe something generic in Wox's code based on the plugin GUIDs.)

@jaimecbernardo I suggest to discuss implementation via email and to create a new issue.

I honestly have no idea how to make that work and would appreciate some direction 👀

@htcfreek
Collaborator

htcfreek commented Apr 12, 2023

@Simizfo For such a plugin we should add support for disabling Plugins using GPOs. (Maybe something generic in Wox's code based on the plugin GUIDs.)

@jaimecbernardo I suggest to discuss implementation via email and to create a new issue.

I honestly have no idea on how to make that work, would appreciate some directions 👀

I can do the implementation then. I am writing my ideas down so I can communicate them, and I should have them ready by the weekend.

The main point would be that the admin can define a table based on the plugin GUIDs and set the enabled state for each GUID. The evaluation of the policy value(s) will happen in the Wox.Plugin class.
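A rough sketch of the idea described above, under stated assumptions: the policy is modeled as an in-memory table from plugin GUID to forced state, and a policy entry overrides the user's own setting. `PluginPolicy` and `IsEnabled` are hypothetical names; how the table is actually read from Group Policy, and where in Wox.Plugin it is evaluated, is left open here.

```csharp
using System;
using System.Collections.Generic;

public static class PluginPolicy
{
    // Hypothetical evaluation rule: an admin-defined entry for the plugin GUID wins;
    // without an entry, the user's own enabled/disabled setting applies.
    public static bool IsEnabled(
        Guid pluginId,
        bool userSetting,
        IReadOnlyDictionary<Guid, bool> policyTable)
    {
        return policyTable.TryGetValue(pluginId, out bool forced) ? forced : userSetting;
    }
}
```

Because the lookup is keyed only on the GUID, the same check would work for any plugin without plugin-specific code.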

@TheModdersDen

TheModdersDen commented Apr 15, 2023

Just an FYI, the news site TechRadar made a post about this plugin. It might be wise to lock this to collaborators only...

EDIT: Also, cool plugin! Can't wait to see it implemented!

@AndrewJacksonZA

Just an FYI, the news site TechRadar made a post about this plugin.

I got here from a Tom's Hardware post. https://www.tomshardware.com/news/microsoft-approves-chatgpt-plugin-for-venerable-powertoys-app

@eaingaran

The title is kind of misleading. OpenAI's API is different from ChatGPT.

For example, I can have a ChatGPT Plus subscription, but I cannot use the API without paying per token.

For the sake of clarity and for future plugins (perhaps a plugin for actual ChatGPT, where people can log in with their account instead of setting an API key), would it be better to rename the plugin?


public string GetChatGPTAnswer(string query)
{
JsonObject data = new JsonObject
Collaborator

Should this be a strongly-typed model instead of managing JSON objects? Seems like the right way (even for debugging purposes) would be to have an actual pre-defined object here?
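To make the suggestion above concrete, here is a minimal sketch of a strongly typed request serialized with System.Text.Json. The type names (`ChatMessage`, `ChatRequest`, `ChatRequestBuilder`) are hypothetical, and the body shape (a `model` plus a list of `role`/`content` messages) assumes the plugin targets OpenAI's chat completions endpoint, which this thread does not confirm.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// One message in the conversation; the JSON names are set explicitly.
public sealed record ChatMessage(
    [property: JsonPropertyName("role")] string Role,
    [property: JsonPropertyName("content")] string Content);

// The request body as a typed object instead of a hand-built JsonObject.
public sealed record ChatRequest(
    [property: JsonPropertyName("model")] string Model,
    [property: JsonPropertyName("messages")] IReadOnlyList<ChatMessage> Messages);

public static class ChatRequestBuilder
{
    // Assumption: a single user message and the gpt-3.5-turbo model.
    public static string ToJson(string query) =>
        JsonSerializer.Serialize(new ChatRequest(
            "gpt-3.5-turbo",
            new[] { new ChatMessage("user", query) }));
}
```

With a model like this, a typo in a field name becomes a compile-time or review-time problem rather than a silent runtime one, and the object can be inspected directly in the debugger.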

{
if (response.IsSuccessStatusCode)
{
JsonDocument doc = JsonDocument.Parse(response.Content.ReadAsStringAsync().Result);
Collaborator

Same comment as above - should this be deserialized into a managed object rather than create structure readers this way?
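A matching sketch for the response side, again with hypothetical type names (`ReplyMessage`, `ReplyChoice`, `ChatReply`, `ChatReplyParser`) and assuming a chat completions style body in which the answer sits at `choices[0].message.content`. Properties not modeled here are ignored by the serializer's defaults.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed record ReplyMessage(
    [property: JsonPropertyName("role")] string Role,
    [property: JsonPropertyName("content")] string Content);

public sealed record ReplyChoice(
    [property: JsonPropertyName("message")] ReplyMessage Message);

public sealed record ChatReply(
    [property: JsonPropertyName("choices")] IReadOnlyList<ReplyChoice> Choices);

public static class ChatReplyParser
{
    // Returns the first answer text, or null when the body has no usable choice.
    // Malformed JSON still throws JsonException, which the caller should handle.
    public static string? FirstAnswer(string json)
    {
        ChatReply? reply = JsonSerializer.Deserialize<ChatReply>(json);
        if (reply?.Choices is null || reply.Choices.Count == 0)
        {
            return null;
        }

        return reply.Choices[0].Message?.Content;
    }
}
```

Compared with walking a `JsonDocument`, the null checks live in one place and the expected shape is visible in the type definitions.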

Result emptyResult = new()
{
Title = "Write your query to chatGPT",
SubTitle = "Don't forget to end your query with \"?\"",
Collaborator

Ending the query with a question mark seems like an artificial constraint that is being put on the user when it could be addressed programmatically?

Author

What would you propose? This has been made to avoid spamming the API, since it has a cost. Also you can get rate limited, and we don't want that

Collaborator

It feels like there needs to be a better way to signal "I am done typing." You could use a timer (if there is no typing for N seconds, send the query), or you could require user input (e.g. pressing ENTER). Otherwise, you're also bound to run into issues where someone uses a question mark in the middle of the query, say "what does ? mean in C#."

I agree with the "what does ? mean in C#" scenario, but I don't fully agree with the timer one.

Collaborator

@dend dend Apr 19, 2023

The implementation details are left as an exercise to the PR author 😀 At the end of the day, if you do not have an existing event pipeline or a terminator key press, you need to know that a user stopped typing, which can be, and generally is, determined with a timer behind the scenes.

Alternatively, you could even go as far as using Task.Delay when typing is performed in the box, resetting a flag as needed. But that would rely on you having an event pipeline to hook into.
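The Task.Delay idea above could look roughly like the following debounce sketch. `Debouncer` and `WaitForPauseAsync` are hypothetical names, the quiet period is an arbitrary choice, and the sketch assumes there is some per-keystroke event to call it from, which is exactly the open question in this thread. It also leaves disposal of the token sources out for brevity.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class Debouncer
{
    private readonly TimeSpan _quietPeriod;
    private CancellationTokenSource? _pending;

    public Debouncer(TimeSpan quietPeriod) => _quietPeriod = quietPeriod;

    // Call on every keystroke. Completes with true only for the call that was
    // not superseded by a later keystroke within the quiet period.
    public async Task<bool> WaitForPauseAsync()
    {
        var mine = new CancellationTokenSource();

        // Swap in this call's token source and cancel the previous waiter, if any.
        CancellationTokenSource? previous = Interlocked.Exchange(ref _pending, mine);
        previous?.Cancel();

        try
        {
            await Task.Delay(_quietPeriod, mine.Token);
            return true;
        }
        catch (OperationCanceledException)
        {
            // A newer keystroke arrived before the quiet period elapsed.
            return false;
        }
    }
}
```

A caller would send the query only when the returned task yields true, so a burst of keystrokes results in a single API request.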

}
else
{
responseContent = $"Failed: {response.StatusCode} - {response.Content.ReadAsStringAsync().Result}";
Collaborator

Is there ever a possibility of the content being null?

Author

Will have to check. This is v0 code, belongs to the first hacked-up version I made in the first night.


if (settings != null && settings.AdditionalOptions != null)
{
personalAPIKey = ((PluginAdditionalStringOption)settings.AdditionalOptions.FirstOrDefault(x => x.Key == PersonalAPIKey))?.Value ?? string.Empty;
Collaborator

You can just return this value instead of storing it in a variable that is not used anywhere else but this function.

@dend
Collaborator

dend commented Apr 17, 2023

The title is kinda misleading. OpenAI's API interface is different from ChatGPT.

For example, I can have a subscription to ChatGPT plus, but I cannot use the API's without paying per token.

For the sake of clarity and for future plugins (perhaps a plugin for actual chatGPT, where people can login with their account instead of setting the API key), would it be better to rename the plugin?

I'll second the proposal to rename this. While this plugin is constrained to ChatGPT, the underlying model can be easily selected through the REST API and can be extended in the future.

@Simizfo
Author

Simizfo commented Apr 18, 2023

The title is kinda misleading. OpenAI's API interface is different from ChatGPT.

For example, I can have a subscription to ChatGPT plus, but I cannot use the API's without paying per token.

For the sake of clarity and for future plugins (perhaps a plugin for actual chatGPT, where people can login with their account instead of setting the API key), would it be better to rename the plugin?

Makes sense. Might be useful if someday Microsoft opens some kind of "Bing Chat API" and we want to implement it.

@Simizfo
Author

Simizfo commented Apr 20, 2023

How is "Ask" for this plugin name? Pretty simple, and fits the theme

By the way, status update: I'm currently working on the ability to change the result shape to better fit the AI response. Will have more to share soon

@htcfreek
Collaborator

How is "Ask" for this plugin name? Pretty simple, and fits the theme

I prefer "Ask ChatGPT"

By the way, status update: I'm currently working on the ability to change the result shape to better fit the AI response. Will have more to share soon

Here you can talk with @niels9001. He made some UI concepts in the past.

@github-actions

github-actions bot commented May 2, 2023

@check-spelling-bot Report

🔴 Please review

See the 📂 files view or the 📜action log for details.

Unrecognized words (2)

gpt
openai

Previously acknowledged words that are now absent: GPT
To accept ✔️ these unrecognized words as correct and remove the previously acknowledged and now absent words, run the following commands

... in a clone of the git@github.com:Simizfo/PowerToys.git repository
on the pt-run-chatgpt-plugin branch (ℹ️ how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/v0.0.21/apply.pl' |
perl - 'https://github.com/microsoft/PowerToys/actions/runs/4865322998/attempts/1'
Available 📚 dictionaries could cover words not in the 📘 dictionary

This includes both expected items (2280) from .github/actions/spell-check/expect.txt and unrecognized words (2)

Dictionary                                Entries  Covers
cspell:cpp/src/cpp.txt                      30216     124
cspell:win32/src/win32.txt                  53509     116
cspell:python/src/python/python-lib.txt      3873      30
cspell:php/php.txt                           2597      19
cspell:node/node.txt                         1768      13
cspell:typescript/typescript.txt             1211      12
cspell:python/src/python/python.txt           453      10
cspell:java/java.txt                         7642      10
cspell:aws/aws.txt                            218       9
cspell:r/src/r.txt                            808       7

Consider adding them using (in .github/workflows/spelling2.yml):

      with:
        extra_dictionaries:
          cspell:cpp/src/cpp.txt
          cspell:win32/src/win32.txt
          cspell:python/src/python/python-lib.txt
          cspell:php/php.txt
          cspell:node/node.txt
          cspell:typescript/typescript.txt
          cspell:python/src/python/python.txt
          cspell:java/java.txt
          cspell:aws/aws.txt
          cspell:r/src/r.txt

To stop checking additional dictionaries, add:

      with:
        check_extra_dictionaries: ''
If the flagged items are 🤯 false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

@Simizfo
Author

Simizfo commented May 2, 2023

image

Slow progress... next up:
  • Aligning things
  • Adding the icon

@htcfreek
Collaborator

htcfreek commented May 3, 2023

Is there a copy action on pressing enter?

@Simizfo
Author

Simizfo commented May 3, 2023

Is there a copy action on pressing enter?

This is a good idea!

@htcfreek
Collaborator

htcfreek commented May 3, 2023

Is there a copy action on pressing enter?

This is a good idea!

It should be the result action that gets executed when clicking the result/pressing ENTER.

@Simizfo
Author

Simizfo commented May 8, 2023

image

I've finally implemented a way to have custom result display modes besides the current/original one. Other plugins can also benefit from this and easily implement custom display modes. I'll open another PR for this tomorrow!

@htcfreek
Collaborator

htcfreek commented May 9, 2023

I've finally implemented a way to have custom result display modes implemented besides the current/original one. Other plugins can also benefit of this and easily implement custom displaying modes. I'll open another PR for this tomorrow!

Great. Do we know the result/answer source (like Wikipedia)? Then we should display it under the answer in light grey.

And btw I like the plugin icon.

@Simizfo
Author

Simizfo commented May 9, 2023

Great. Do we know the result/answer source (like Wikipedia)? Then we should display it under the answer in light grey.

gpt-3.5-turbo and 4.0 don't provide sources.

And btw I like the plugin icon.

thanks!

@Simizfo
Author

Simizfo commented May 25, 2023

Question: Should I keep working on this with the recent announcement at Build 2023 of Windows Copilot? Looks like it covers the features of this plugin and even more 🤔

@asheroto

asheroto commented May 25, 2023

Does it cover ChatGPT itself or just Windows' AI?

While I am a big supporter of Microsoft, my preference at the moment is still ChatGPT for LLMs.

According to ChatGPT + browsing plugin, when asked the difference between ChatGPT and Bing AI, here's the response:

image

Maybe someone from Microsoft could chime in?

My vote is to continue working on it. 😊 Thank you

@deinfluence

Question: Should I keep working on this with the recent announcement at Build 2023 of Windows Copilot? Looks like it covers the features of this plugin and even more 🤔

My understanding is Windows Copilot will only be on Windows 11 and hence this plugin will be useful for those still running Windows 10 (with PowerToys).

@tiagorangel2011

tiagorangel2011 commented Jul 6, 2023

I think this should use Bing Chat instead of our own API key

@bigplayer-ai

Nice, this is what I was looking for. Can you add support for Bing Chat? Check my post #24445:

Description of the new feature / enhancement

Hi,
I would like to have a convenient way to access Bing chat from any screen on my computer.
Bing chat is already integrated in the taskbar in the new Windows version, but I think it would be even better to have a hotkey to chat with it.

Scenario when this would be used?

This feature would be useful for getting fast and accurate answers to specific questions without interrupting the workflow.
For example, I could use Bing chat like Spotlight search on macOS, so I can ask Bing chat questions and continue working. This would make the Bing chat experience more seamless.
When I press a hotkey, the chat pops up in a floating window and when I press that hotkey again, the chat hides away.
This would be an improved version of the Bing Desktop Search bar that was available for Windows 10 users via Edge.

Supporting information

Bing Desktop Search bar: https://www.neowin.net/news/following-windows-11-microsoft-pushes-bing-desktop-search-bar-to-windows-10-via-edge/

@tiagorangel2011

tiagorangel2011 commented Jul 7, 2023 via email

@khot-aditya

I am really looking forward to this plugin going live. I think perplexity.ai is also a good option to integrate, as it scrapes live data from the internet. @Simizfo, I'm really looking forward to testing this plugin.

@Simizfo
Author

Simizfo commented Sep 17, 2023

I have moved away from this since Microsoft announced Windows Copilot and there was very little demand for my plugin. If someone wants it, a fully working prototype is available on my fork of PowerToys on my GitHub.

I'm closing this, thanks Clint for the great opportunity!

@Simizfo Simizfo closed this Sep 17, 2023
@D4n2021

D4n2021 commented Sep 17, 2023

I have moved away from this since Microsoft has announced Windows Copilot and there was really small request for my plugin. If someone wants to have it, a fully-working prototype is available on my fork of PowerToys on my GitHub.

Can you please release your fork as .exe file? I don't have Visual Studio installed currently and thus can't manually build your fork :/

@Simizfo
Author

Simizfo commented Sep 26, 2023

I have moved away from this since Microsoft has announced Windows Copilot and there was really small request for my plugin. If someone wants to have it, a fully-working prototype is available on my fork of PowerToys on my GitHub.

Can you please release your fork as .exe file? I don't have Visual Studio installed currently and thus can't manually build your fork :/

Can't, sorry 😢 all the builds I made have my own OpenAI API key, so I can't share it.

@htcfreek
Collaborator

I have moved away from this since Microsoft has announced Windows Copilot and there was really small request for my plugin. If someone wants to have it, a fully-working prototype is available on my fork of PowerToys on my GitHub.

Can you please release your fork as .exe file? I don't have Visual Studio installed currently and thus can't manually build your fork :/

Can't, sorry 😢 all the builds I made have my own OpenAI API key, so I can't share it.

What about releasing a public version on your GH account that supports storing the key in a string setting (supported in 0.75.0 and higher)?

@htcfreek htcfreek mentioned this pull request Sep 26, 2023
11 tasks
@mogwai

mogwai commented Feb 26, 2024

It would be great to see this added as an alternative, and probably simpler, version of what I imagine Microsoft will implement!
