Dify supports custom API base domains for OpenAI and for any model API server compatible with the OpenAI API. In the community edition, you can enter the target server address via Settings —> Model Providers —> OpenAI —> Edit API.
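Before pointing Dify at a custom endpoint, you can sanity-check that the server really speaks the OpenAI API by calling it with the official OpenAI Python SDK. This is a minimal sketch; the base URL and key below are placeholders for your own server's values:

```python
from openai import OpenAI

# Point the client at an OpenAI-compatible server instead of api.openai.com.
# Base URL and key are placeholders; substitute your own server's values.
client = OpenAI(
    base_url="https://your-api-server.example.com/v1",  # hypothetical endpoint
    api_key="sk-your-key",
)

# Any server that answers this request correctly should work with
# Dify's OpenAI provider configuration.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)
```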
In natural language processing, longer text outputs usually require more computation time and resources, so limiting the output length can reduce computational cost and latency. For example, setting max_tokens=500 tells the model to generate at most 500 tokens: generation stops once the limit is reached, so nothing beyond it is produced. This keeps the output within the LLM’s acceptable range and conserves computational resources, improving model efficiency. Additionally, max_tokens counts against the model’s total context window, so setting a smaller max_tokens leaves more room for the prompt. For instance, gpt-3.5-turbo has a limit of 4,097 tokens; if max_tokens=4000, only 97 tokens remain for the prompt, and exceeding that will cause an error.
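For illustration, here is a hedged sketch of setting max_tokens with the OpenAI Python SDK. When the limit cuts off generation, the API reports finish_reason set to "length", which is how truncated answers can be detected:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize the history of NLP."}],
    max_tokens=500,  # the model generates at most 500 tokens, then stops
)

choice = response.choices[0]
print(choice.message.content)
# "length" means the reply was cut off by max_tokens
# rather than finishing naturally.
print("truncated:", choice.finish_reason == "length")
```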
In some natural language processing applications, texts are typically split by paragraphs or sentences to better handle and understand the semantic and structural information in the text. The smallest splitting unit depends on the specific task and technical implementation: for example, sentence-level segments suit precise question answering, while paragraph-level segments preserve more surrounding context.
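As an illustration only (this is not Dify's internal splitter), a simple two-level splitter could break text on blank lines for paragraphs, then on sentence-ending punctuation:

```python
import re

def split_paragraphs(text: str) -> list[str]:
    """Split on blank lines; each paragraph becomes one segment."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(paragraph: str) -> list[str]:
    """Naive sentence split on ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

text = "First paragraph. It has two sentences.\n\nSecond paragraph."
for para in split_paragraphs(text):
    print(split_sentences(para))
# [['First paragraph.', 'It has two sentences.'], ['Second paragraph.']]
```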
Finally, experiments and evaluations are needed to determine the most suitable embedding technique and splitting unit. You can compare the performance of different techniques and splitting units on the test set and choose the optimal solution.
We use cosine similarity. The choice of distance function is generally not critical. OpenAI embeddings are normalized to a length of 1, which means:
- Using the dot product can slightly speed up the calculation of cosine similarity
- Cosine similarity and Euclidean distance will result in the same rankings
When embedding vectors are normalized to a length of 1, calculating the cosine similarity between two vectors can be simplified to their dot product. Since the normalized vector lengths are all 1, the dot product result is equivalent to the cosine similarity result. Given that dot product operations are faster than other similarity measures (like Euclidean distance), using normalized vectors for dot product calculations can slightly improve computational efficiency.
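This equivalence is easy to verify numerically. A minimal NumPy sketch, with made-up vectors standing in for real embeddings:

```python
import numpy as np

# Two arbitrary vectors, normalized to unit length like OpenAI embeddings.
a = np.array([0.3, 0.4, 0.5]); a /= np.linalg.norm(a)
b = np.array([0.1, 0.9, 0.2]); b /= np.linalg.norm(b)

dot = np.dot(a, b)
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# For unit vectors the two are identical, so the cheaper dot product suffices.
print(np.isclose(dot, cosine))  # True

# Euclidean distance is a monotone function of the dot product for unit
# vectors: ||a - b||^2 = 2 - 2 * (a . b), so both rank neighbors identically.
print(np.linalg.norm(a - b) ** 2, 2 - 2 * dot)
```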
We collaborate with major model providers to offer a free trial token quota to users in China. In Dify, go to Settings —> Model Providers —> Show more model providers and click “Get Free” on the Zhipu·AI, iFlytek Spark, or MiniMax icons. If you don’t see this option in the English interface, switch the product language to Chinese.
Once the trial quota is credited, select the model you need to use in Prompt Arrangement —> Model and Parameters —> Language Model.
This indicates that the account behind your OpenAI key has run out of credit. Please go to OpenAI to top up your balance.
Error one:
Error two:
Please check if you have reached the official API call rate limit. Refer to the OpenAI official documentation for details.
First, check that the front-end and back-end versions are both up to date and match each other. Second, this error may occur if you are using an Azure OpenAI key but have not successfully deployed the model; check whether the model is deployed in your Azure OpenAI resource. The gpt-3.5-turbo deployment must be model version 0613 or later, because versions before 0613 do not support the function call capability that Zhichat relies on, making it unusable.
This is usually caused by a proxy configured in your environment. Please check whether a proxy is set.
Each model has different parameter values. Set the parameter values according to the current model’s range.
In the parameter settings on the orchestration page, reduce the “Max Tokens” value.
The default model can be configured in Settings - Model Providers. Currently, it supports text generation models from providers like OpenAI / Azure OpenAI / Anthropic, and also supports integration of open-source models hosted on Hugging Face / Replicate / xinference.
Check if the API key for the Embedding model you are using has reached the rate limit.
If you encounter the error “Invalid token,” try the following solutions:
Currently, the maximum size for a single document upload is 15MB, with an overall limit of 100 documents. If you need to adjust these limits in a locally deployed version, refer to the documentation.
Because Claude does not provide an Embedding model, embedding and, by default, other dialogue generation use your OpenAI key, which consumes your OpenAI quota. You can also set different default inference and Embedding models in Settings - Model Providers.
Whether a dataset is used depends on its description, so write the description as clearly as possible. For specific writing techniques, refer to this documentation.
Put the header in the first row and one record in each subsequent row, with no extra header rows or complex table formatting.
For example, in the table below, only the real header row (the second row) should be kept as the header. The first row (“Table 1”) is an extra title row and should be removed.
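As a hedged sketch of why the extra title row matters (the column names and values here are made up), a CSV parser such as pandas treats the first row as the header, so a title row shifts everything down:

```python
import pandas as pd
from io import StringIO

# A sheet with an extra title row above the real header (bad layout):
# the header becomes "Table 1" and the real header lands in the data rows.
bad = StringIO("Table 1,\nName,Age\nAlice,30\n")
print(pd.read_csv(bad))

# The same data with the header in the first row (good layout):
# columns come out as Name and Age, as intended.
good = StringIO("Name,Age\nAlice,30\n")
print(pd.read_csv(good))
```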
OpenAI’s GPT-4 model API and ChatGPT Plus are separate products billed separately. The model API has its own pricing; refer to OpenAI’s pricing documentation. To apply for paid access, you must first bind a payment card, which grants GPT-3.5 API access but not GPT-4 access. GPT-4 API access additionally requires a paid bill on record. Refer to the OpenAI official documentation for details.
Dify supports using several providers’ models as Embedding models. Simply select the Embeddings type in the configuration box.
This feature provides application templates for cloud version users to reference; it does not currently support publishing your own applications as templates. If you use the cloud version, you can use Add to Workspace or Customize to turn a template into your own application. If you use the community version and need to create more application templates for your team, you can contact our commercialization team for paid technical support: business@dify.ai.