Overview
FooocusAPI is a refactor of Fooocus-API; both are secondary developments built on Fooocus. It addresses two problems in the original project: the Gradio API calls are hard to understand, and they are poorly suited to deploying Fooocus as a service.
How Fooocus works
When you start reading the Fooocus code, you will find that most of the main logic, such as parameter handling and task management, is concentrated in async_worker.py.
Fooocus v2.5.0 refactored this file, so what follows is a brief explanation of how the file works in version 2.4.3:
First, a class AsyncTask is defined to instantiate task objects. It has the following attributes:
args: Stores the task parameters. It is a list containing the information the user submitted from the WebUI.
yields: Stores the progress, intermediate results, and preview images generated during task execution. It is a list.
results: Stores the final result of the task. It is a list containing the local paths of the generated images.
last_stop: Records the last stop request for the task, which can be skip or stop. It is checked inside the task execution loop.
processing: Records the current status of the task. If it is True, the task is being executed.
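The attribute list above can be pictured as a small class. This is a minimal sketch based only on the description given here; the default values (empty lists, False) are assumptions, and the example arguments are made up.

```python
class AsyncTask:
    """Container for one generation request and its runtime state."""

    def __init__(self, args):
        self.args = args          # list of parameters submitted from the WebUI
        self.yields = []          # progress, intermediate results, preview images
        self.results = []         # local paths of the final images
        self.last_stop = False    # assumed default; becomes 'skip' or 'stop' on request
        self.processing = False   # True while the task is being executed


# Hypothetical arguments, only to show how a task object is created.
task = AsyncTask(args=["a cat", "", ["Fooocus V2"]])
```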
Next, a list async_tasks = [] is defined. It is the queue of tasks waiting to be executed, and a loop tries to take tasks from it and execute them.
Then comes worker, a method roughly 1000 lines long. Most of it is parameter processing of various kinds, but that is not a problem; we will go through it from the beginning, one step at a time.
When a task is submitted, the information the user submitted from the WebUI is instantiated as a task object and added to the async_tasks list. This happens at line 50 of webui.py. Next, the loop that checks the async_tasks list takes the task and executes it. This code is located at the end of async_worker.py.
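The polling loop might look roughly like the following. This is a simplified sketch rather than the code from Fooocus: handler is a placeholder, the output path is made up, the 'finish' marker is an assumption of this example, and the loop stops after a few empty polls so that the demo terminates (a real worker would run forever).

```python
import time
import traceback

async_tasks = []  # shared queue: the WebUI appends, the worker loop pops


class AsyncTask:
    def __init__(self, args):
        self.args = args
        self.yields = []
        self.results = []
        self.processing = False


def handler(task):
    # Placeholder for the real, much longer processing function.
    task.processing = True
    task.results.append(f"/tmp/{task.args[0]}.png")  # made-up output path
    task.processing = False


def worker_loop(max_idle_polls=3):
    """Poll the queue; give up after a few empty polls so the demo ends."""
    idle = 0
    while idle < max_idle_polls:
        time.sleep(0.01)
        if not async_tasks:
            idle += 1
            continue
        idle = 0
        task = async_tasks.pop(0)  # first in, first out
        try:
            handler(task)
        except Exception:
            traceback.print_exc()
        finally:
            # Assumed convention: tell watchers of task.yields that we are done.
            task.yields.append(["finish", task.results])


first, second = AsyncTask(["one"]), AsyncTask(["two"])
async_tasks.extend([first, second])
worker_loop()
```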
After a task is taken, it is passed to handler for processing (line 135).
First, handler extracts all the parameters and does some preliminary processing. This part ends at line 220.
Next, the parameters are processed further: case conversion, determining the number of steps from the selected performance, adjusting the model, LoRA, and other related parameters, and downloading models if needed. This finishes around line 343.
Then two lists are defined: goals = [] and tasks = []. goals stores the image-processing labels, i.e. uov, inpaint, and ip. tasks is used to split the work: when image_number > 1, the request is split into multiple tasks, which are stored in the tasks list.
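To make the role of tasks concrete, here is a minimal sketch of splitting one request into one entry per image. The function name, the dictionary keys, and the idea of offsetting the seed per image are assumptions made for illustration, not details taken from Fooocus.

```python
def split_into_tasks(prompt, negative_prompt, seed, image_number):
    """Return one task description per requested image."""
    tasks = []
    for i in range(image_number):
        tasks.append({
            "task_seed": seed + i,            # assumption: vary the seed per image
            "prompt": prompt,
            "negative_prompt": negative_prompt,
        })
    return tasks


tasks = split_into_tasks("a cat", "blurry", seed=100, image_number=3)
```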
Next, if input_image is checked, the following part of the code runs. It adds markers to the goals list based on the current tab and on whether mixing_image_prompt_and_vary_upscale and mixing_image_prompt_and_inpaint are checked. At the same time, the uploaded images are preprocessed and any required models are downloaded. This section is located at approximately lines 348-422.
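The marker selection described above could be sketched as follows. The function name is hypothetical, the conditions are a simplification, and the marker strings reuse the labels mentioned in this document (uov, inpaint, ip); the strings used in the actual source may differ.

```python
def collect_goals(input_image_checked, current_tab,
                  mixing_image_prompt_and_vary_upscale=False,
                  mixing_image_prompt_and_inpaint=False):
    """Decide which image-processing stages a request needs."""
    goals = []
    if not input_image_checked:
        return goals  # plain text-to-image: no image goals

    # Vary/upscale: selected directly, or mixed in from the image prompt tab.
    if current_tab == "uov" or (
            current_tab == "ip" and mixing_image_prompt_and_vary_upscale):
        goals.append("uov")

    # Inpaint: selected directly, or mixed in from the image prompt tab.
    if current_tab == "inpaint" or (
            current_tab == "ip" and mixing_image_prompt_and_inpaint):
        goals.append("inpaint")

    # Image prompt: its own tab, or enabled by either mixing checkbox.
    if (current_tab == "ip" or mixing_image_prompt_and_vary_upscale
            or mixing_image_prompt_and_inpaint):
        goals.append("ip")

    return goals


text_only = collect_goals(False, "uov")
upscale_only = collect_goals(True, "uov")
mixed = collect_goals(True, "ip", mixing_image_prompt_and_inpaint=True)
```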
Whether or not input_image is checked, the code then briefly loads the model, applies parameter overrides, and reaches the skip_prompt_processing check. This logic is located at lines 448-549. Its function is to expand the prompt and negative prompt based on the selected styles so that they work better with the model.
What follows is a series of processing steps driven by the contents of goals. Apart from upscale fast, which returns its result directly, the other cases continue to be processed in stages until line 868, where the tasks list is iterated over. If everything goes smoothly, the final processing happens here: formatting metadata, saving files, and returning the results.
Throughout the entire process, the task's execution status is continuously updated through the yields attribute of the task object. The callback function shows clearly how entries in this list are structured. After a simple calculation, it produces a list similar to the following: ['preview', (60, 'Sample step 60/100, image 1/1 ...', y)]. The elements of this list have the following meaning:
preview: A kind of phase identifier; the information it provides is limited.
Tuple:
    60: The overall progress.
    'Sample step 60/100, image 1/1 ...': The description of the current step.
    y: The image at each step, which lets you watch an image as it is being generated.
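A callback that produces entries of this shape could look like the sketch below. The make_callback helper, its parameter names, and the progress formula are assumptions for illustration; only the structure of the appended entry follows the example above.

```python
def make_callback(task_yields, base_progress, total_steps,
                  image_index, image_number):
    """Build a per-step callback that records preview entries."""

    def callback(step, preview_image):
        # Map the step count onto the remaining part of the 0-100 range,
        # using integer arithmetic to avoid floating point surprises.
        progress = base_progress + (100 - base_progress) * (step + 1) // total_steps
        message = (f"Sample step {step + 1}/{total_steps}, "
                   f"image {image_index + 1}/{image_number} ...")
        task_yields.append(["preview", (progress, message, preview_image)])

    return callback


yields = []
callback = make_callback(yields, base_progress=0, total_steps=100,
                         image_index=0, image_number=1)
callback(59, preview_image=None)  # step index 59 is the 60th step

flag, (progress, message, image) = yields[-1]
```

A consumer only has to unpack the last entry, as in the final line, to obtain the phase flag, the progress value, the status text, and the preview image.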
Thinking Behind the Refactor
In the Fooocus-API project created by konieshadow, a new task queue and new task objects were built on top of FastAPI. Then, by rewriting part of the logic in async_worker.py, he completed the development of Fooocus-API.
After I took over and maintained the project for half a year, the issues caused by this approach became increasingly difficult to handle. The two main problems are:
Whenever Fooocus is updated, the code in async_worker.py has to be updated in sync. Because it works as a generator, every change must be handled with care.
The project starts by launching a FastAPI service, which prevents reuse of the pre-startup logic in Fooocus and forces it to be reimplemented. This includes dependency installation, environment detection, configuration file reading, and so on, although these can be achieved by simply duplicating code.
Additionally, there are some minor long-standing issues, such as:
The WebUI cannot be used at the same time.
A separate endpoint has to be requested to obtain progress images.
Task information is not fully persisted, and the returned data formats are inconsistent.
Based on the above issues, I decided to refactor Fooocus-API with the following approach:
Reuse the task handling logic in async_worker.py, with the API solely responsible for receiving parameters and passing them to that logic.
Abandon the separately maintained queue and reuse the queue in async_worker.py.
Merge the interface functions. Since all parameters are ultimately processed by the handler function in async_worker.py, the API only needs to receive parameters and pass them on, so separate interfaces are unnecessary.
How FooocusAPI works
So I redesigned the structure of FooocusAPI:
A method was added in webui.py to start both the API service and the WebUI, so the two can be used at the same time.
All the parameters are put into a single model, CommonRequest.
A new function, pre_process, preprocesses the parameters, e.g. converting parameters, saving images, and downloading models.
api_utils then processes the parameters and adds the task to the async_tasks list in async_worker.py.
Finally, call_worker monitors the execution status of the task and, once it completes, returns different results depending on the parameters.
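The flow described above can be sketched end to end as follows. Everything here is a simplified stand-in: the CommonRequest fields, the params_to_args helper (playing the role described for api_utils), the 'finish' marker, the signatures of pre_process and call_worker, and fake_worker are all assumptions for illustration and do not reproduce the project's actual code.

```python
import threading
import time
from dataclasses import dataclass, field

async_tasks = []  # stand-in for the queue in async_worker.py


class AsyncTask:
    def __init__(self, args):
        self.args = args
        self.yields = []
        self.results = []


@dataclass
class CommonRequest:
    # Hypothetical subset of fields; the real model carries many more.
    prompt: str = ""
    negative_prompt: str = ""
    image_number: int = 1
    style_selections: list = field(default_factory=list)


def pre_process(request):
    # Stand-in for parameter conversion, image saving, model downloads.
    request.prompt = request.prompt.strip()
    request.image_number = max(1, request.image_number)
    return request


def params_to_args(request):
    # Flatten the request into the positional list the worker expects.
    return [request.prompt, request.negative_prompt,
            request.style_selections, request.image_number]


def call_worker(request, poll_interval=0.01, timeout=5.0):
    """Queue the task, then watch task.yields until it finishes."""
    task = AsyncTask(params_to_args(pre_process(request)))
    async_tasks.append(task)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        while task.yields:
            flag, payload = task.yields.pop(0)
            if flag == "finish":
                return payload
        time.sleep(poll_interval)
    raise TimeoutError("task did not finish in time")


def fake_worker():
    # Minimal replacement for the worker loop, enough to run the demo.
    while not async_tasks:
        time.sleep(0.005)
    task = async_tasks.pop(0)
    task.results.append("/tmp/output.png")  # made-up output path
    task.yields.append(["finish", task.results])


threading.Thread(target=fake_worker, daemon=True).start()
result = call_worker(CommonRequest(prompt="  a cat  "))
```

The point of the design is that the API layer never generates images itself: it only builds a task, hands it to the existing queue, and reads back what the worker reports.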
With the refactoring complete, the original functionality is preserved as far as possible, and the API service can coexist with the WebUI. Modifications to async_worker.py are still required for features that Fooocus does not provide, such as custom upscale factors and customizable outpainting, but the amount of modification has been greatly reduced. Because the changes to Fooocus are limited, the maintenance cost of the API is much lower as long as there are no major version changes, which makes it easier to track Fooocus updates. For minor updates, simply merging the upstream code is sufficient.