Currently, importing a 700MB zip file of textures for a diffuse map LoRA into the Civitai On-Site LoRA Trainer involves approximately 10-20 failed upload attempts. This is cumbersome given that the same data is already hosted in a standardized form on HuggingFace's servers (along with numerous other datasets).
To streamline the import, it would help to provide an input field that accepts the dataset name directly, such as “alastandy/Diffuse_Map_Surfaces”, the corresponding URL, or the DOI 10.57967/hf/3756. Users could then specify the desired dataset without any manual file upload.
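As a rough illustration of what accepting those three input forms might look like, here is a minimal sketch. The function name parse_dataset_reference and the returned tuple shape are hypothetical and not part of any existing Civitai or HuggingFace API; the only things assumed are the usual https://huggingface.co/datasets/<owner>/<name> URL layout and the standard https://doi.org/ resolver.

```python
import re

# Dataset page URL, e.g. https://huggingface.co/datasets/<owner>/<name>
HF_DATASET_URL = re.compile(r"^https?://huggingface\.co/datasets/([^/\s]+/[^/\s?#]+)")
# Bare DOI, optionally prefixed with "doi:" or the doi.org resolver
DOI = re.compile(r"^(?:https?://doi\.org/|doi:)?(10\.\d{4,9}/\S+)$", re.IGNORECASE)
# Plain "<owner>/<name>" dataset name
REPO_ID = re.compile(r"^[\w.-]+/[\w.-]+$")


def parse_dataset_reference(text):
    """Classify user input as a dataset repo id or a DOI link (hypothetical helper)."""
    text = text.strip()
    match = HF_DATASET_URL.match(text)
    if match:
        return ("repo_id", match.group(1))
    match = DOI.match(text)
    if match:
        # A DOI still has to be resolved over the network to find the dataset page
        return ("doi_url", "https://doi.org/" + match.group(1))
    if REPO_ID.match(text):
        return ("repo_id", text)
    raise ValueError(f"Not a recognised dataset name, URL, or DOI: {text!r}")


print(parse_dataset_reference("alastandy/Diffuse_Map_Surfaces"))
print(parse_dataset_reference("10.57967/hf/3756"))
```

The DOI check runs before the plain-name check because a short DOI could otherwise be mistaken for an owner/name pair.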
This should be relatively straightforward to implement using the existing datasets Python package available at https://github.com/huggingface/datasets. Alternatively, the data could be copied directly from the automatically generated DuckDB or Parquet files, which may be even more efficient.
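For a sense of how small the server-side fetch could be, here is a minimal sketch. Note that it swaps in snapshot_download from the separate huggingface_hub package rather than the datasets package mentioned above, because it copies the repository files (images plus metadata.jsonl) as they are. The function name fetch_dataset, the downloader parameter, and the target directory are hypothetical; the downloader hook only exists so the sketch can be exercised without network access.

```python
def fetch_dataset(repo_id, target_dir, downloader=None):
    """Download a HuggingFace dataset repository into target_dir (hypothetical helper).

    By default this calls huggingface_hub.snapshot_download, which needs the
    third-party huggingface_hub package and network access.
    """
    if downloader is None:
        # Imported lazily so the module loads even without huggingface_hub installed
        from huggingface_hub import snapshot_download
        downloader = snapshot_download
    # repo_type="dataset" is required; the default repo type is a model repository
    return downloader(repo_id=repo_id, repo_type="dataset", local_dir=target_dir)


if __name__ == "__main__":
    # Assumed example values: the dataset named in this request and a local folder
    path = fetch_dataset("alastandy/Diffuse_Map_Surfaces", "diffuse_map_surfaces")
    print(f"Downloaded to: {path}")
```

Fetching server to server like this would avoid pushing a 700MB zip through the browser at all.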
By default, the captions for the dataset are stored in the metadata.jsonl file located within the dataset directory. The On-Site LoRA Trainer instead uses a per-file format, with one .txt caption file per image. This is the Python script I use to convert the captions:
import os
import json

# Specify the path to the metadata file
metadata_file = 'metadata.jsonl'

# Create a directory for the text files if it doesn't exist
output_dir = 'captions'
os.makedirs(output_dir, exist_ok=True)

# Read the metadata.jsonl file line by line
with open(metadata_file, 'r', encoding='utf-8') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # Skip blank lines, which json.loads would reject

        # Parse the JSON line into a dictionary
        data = json.loads(line)

        # Extract file_name and prompt
        file_name = data.get("file_name", "")
        prompt = data.get("prompt", "")

        # Create the corresponding .txt file name by replacing .png with .txt
        base_name = os.path.splitext(file_name)[0]  # Strip .png extension
        txt_file_name = f"{base_name}.txt"

        # Define the full path for the new .txt file
        txt_file_path = os.path.join(output_dir, txt_file_name)

        # Write the prompt into the .txt file
        with open(txt_file_path, 'w', encoding='utf-8') as txt_file:
            txt_file.write(prompt)

        print(f"Created: {txt_file_path}")
Awaiting Dev Review
💡 Feature Request
About 1 year ago

alastandy