五维数据 Written Test

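User 454

Last modified November 7, 2025

Hello,

Thank you very much for your interest in 杭州五维数据有限责任公司. The hiring process consists of one written test and one to two rounds of interviews. After receiving this email, please complete the written-test questions below that match the position you applied for and your skills.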

Written test: Front end

Implement the design linked below using a front-end framework or plain HTML/CSS. Submit a GitHub link.

https://www.figma.com/design/D0jjN7Twkq0DGm3ZJvQabz/%E5%89%8D%E7%AB%AF%E5%B7%A5%E7%A8%8B%E5%B8%88%EF%BC%88%E5%AE%9E%E4%B9%A0%EF%BC%89?node-id=0-1&p=f

Written test: Back end

Implement a project using Spring Boot 3.2.0. It should expose a GET HelloWorld API that requires Spring Security authentication and returns the string "Hello World", plus a username-password login API (you may use username test and password 123456). Submit a GitHub link.

Written test: Web crawler

Use both Playwright and DrissionPage to crawl https://mitadmissions.org/blogs/, save the results to a CSV file, and submit it. Your headers should include the following:

Written test: Python

Implement a demo of a streaming API with FastAPI.

Written test: Algorithms

Auto Labeling with LLM / VLM / Other foundation models:

Choose any public dataset (text, image, or multimodal) that you are familiar with, and demonstrate how a foundation model (e.g., LLM, VLM) can automatically generate labels for it.​

What You Need to Do:

1. Dataset and Task Description
  ◦ Which dataset did you choose?
  ◦ What is the input feature (e.g., text, image, etc.)?
  ◦ What is the labeling target (e.g., sentiment, topic, category, object type, etc.)?

2. Model and Method
  ◦ Which model did you use (e.g., GPT, Qwen, CLIP, etc.)?
  ◦ How did you use it? (API or local inference)
  ◦ Describe your prompt or input format design.

3. Runnable Code (Core Part)
  ◦ Provide runnable code (preferably Python) that performs labeling.
  ◦ Show how you read data, call the model, and output results.
  ◦ Include a few example outputs (sample + model label).

4. Result Analysis
  ◦ Describe what patterns you observe in the model’s labeling results.
  ◦ When does the model perform well or fail?
  ◦ If human labels exist, briefly compare accuracy or agreement.

5. Human vs Model Labeling
  ◦ Briefly compare their advantages and limitations, such as: speed, cost, consistency, semantic understanding, or bias.

6. Improvement Ideas (Short Discussion)
  ◦ Propose at least two possible improvements, such as: better prompts, multi-model voting, confidence filtering, or human-in-the-loop refinement.
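As a rough illustration of the accuracy or agreement comparison in item 4 and the voting and filtering ideas in item 6, the sketch below uses only the Python standard library. The function names, the toy labels, and the 0.6 threshold are assumptions made for this example. Using the vote share as a stand-in for confidence is also an assumption, since real model confidences may come from log-probabilities or a separate scorer.

```python
from collections import Counter


def accuracy(human, model):
    """Fraction of samples where the model label equals the human label."""
    if len(human) != len(model) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(h == m for h, m in zip(human, model)) / len(human)


def cohen_kappa(human, model):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(human)
    p_o = accuracy(human, model)
    h_counts, m_counts = Counter(human), Counter(model)
    # Chance agreement: probability both raters pick the same class at random.
    p_e = sum(h_counts[c] * m_counts[c] for c in h_counts) / (n * n)
    if p_e == 1.0:  # both raters used a single identical class
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def vote(labels, min_agreement=0.6):
    """Majority vote over several model outputs for one sample.

    Returns the winning label, or None when the winner's vote share is
    below min_agreement, so the sample can be routed to human review.
    """
    winner, count = Counter(labels).most_common(1)[0]
    return winner if count / len(labels) >= min_agreement else None


# Toy labels, invented purely for illustration.
human = ["pos", "neg", "pos", "neg"]
model = ["pos", "neg", "neg", "neg"]
print(accuracy(human, model))            # 0.75
print(cohen_kappa(human, model))         # 0.5
print(vote(["pos", "pos", "neg"]))       # pos
print(vote(["pos", "neg", "neutral"]))   # None
```

Kappa is reported next to raw accuracy because accuracy alone can look high on imbalanced label sets. Samples for which `vote` returns `None` are natural candidates for human-in-the-loop refinement.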

Submission:

Please submit a compressed folder including:

• demo_code.py or a Jupyter Notebook;
• An example output file;
• A short written report (Markdown or PDF, 1–2 pages) describing your dataset, model, results, and insights.

Notes:
