🌍FOD#116: Benchmarking Season - Exploring AI's New Yardsticks

+ This week's key news and research

๋ฒค์น˜๋งˆํ‚น์˜ ์‹œ์ฆŒ

Up through the week before last we were busy keeping up with model releases, but last week's AI scene was relatively quiet on that front.

The most notable news was probably the release of Gemini 2.5 Flash Image ('Nano Banana'). Personally, I think Gemini's marketing team has finally gotten a name right. Many of you already know about Nano Banana, but what seems to be drawing the most attention is how well it maintains character consistency.

Image Credit: Global Newswire

Whether a person changes hairstyle, pose, or outfit, they are still recognized as nearly the same person as the original. Consistency also holds up across multiple rounds of editing, so this looks likely to have a real impact in areas like ad video production.

Microsoft AI unveiled MAI, its first models developed in-house. They boast ultra-fast processing, natural-sounding voice, efficient training techniques, and, as a result, very strong benchmark performance. Many read the MAI launch as closely tied to Microsoft's changing relationship with OpenAI. The two models announced this time, MAI-Voice-1 and MAI-1-preview, should also be seen as part of a strategy to reduce its technical dependence on OpenAI.

The conflict between OpenAI and Microsoft is playing out on several levels. On the equity question, the two are reportedly being advised by financial firms such as Goldman Sachs and Morgan Stanley, and the core point of contention is said to be disagreement over Microsoft's 49% stake and investment terms as OpenAI converts to a for-profit company. Microsoft already listed OpenAI as a competitor in last year's annual report, and the two companies have become 'frenemies' competing for customers to sell the same AI models. OpenAI, for its part, has also complained that Microsoft does not provide enough computing power.

That's about it for model releases, but something else stood out: quite a few benchmarks and evaluation systems appeared.

Turing Post has discussed benchmarks from several angles before, but the key point is that, despite appearances, a benchmark is not a simple, neutral scoreboard. Each one carries its own philosophy: it defines which tasks matter, what counts as success, and what can safely be ignored.

ImageNet helped drive dramatic progress in computer vision. SQuAD, unfortunately, could be answered without 'real understanding', which introduced a kind of distortion. And GLUE is now saturated and has lost its meaning as a benchmark. Designing a good benchmark is as hard as designing a model, and its impact is enormous too.

Last Week's Flood of Benchmarks

As mentioned, last week alone brought 7 official benchmarks and 6 comparable evaluation systems. What matters is not so much the fact that they appeared; it's that, looked at closely, these benchmarks and evaluation systems show the direction AI is heading:

  • Agentic Work

    MCP-Bench is a comprehensive benchmark built on the Model Context Protocol (MCP) for evaluating LLMs' tool use, multi-step task execution, and complex planning. In particular, it tests whether agents can carry out multi-step tasks using servers and tools.

    ReportBench is a benchmark designed to evaluate how well research agents write academic reports. It measures the depth and quality of scholarly work rather than posing simple quizzes.

  • Domain Specificity

    CMPhysBench evaluates how well AI models understand and apply complex concepts and problems in condensed matter physics.

    AetherCode evaluates competitive programming ability, and MovieCORE evaluates cognitive reasoning about movies.

    The Creativity Benchmark, built for the advertising industry and briefly mentioned in a previous Turing Post issue, can also be considered a domain-specific benchmark.

  • Reasoning across Modalities

    T2I-ReasonBench was built to evaluate AI reasoning in text-to-image generation, and SEAM to check semantic equivalence between language and vision.

    SpotEdit was built as a standard for testing the precision of visual editing.

    Together, these three benchmarks help measure whether AI can operate sensibly and produce accurate results in multimodal settings.

  • Safety and Adaptivity

    Mind the Third Eye! evaluates smartphone agents' privacy awareness, checking how AI handles personal data.

    InMind tests whether AI models can adapt to an individual user's distinctive reasoning style, measuring their capacity for personalized interaction.

  • Harder Frontiers

    UQ (Unsolved Questions) is an innovative benchmark that evaluates AI models on problems that have not yet been solved rather than on a fixed test set. It steers models away from memorized answers and toward creative problem solving; the aim is to check whether AI can actually think deeply.

  • Scientific Reasoning Disentangled

    SCIREAS separates domain knowledge from reasoning ability to evaluate whether a model can "think" scientifically rather than simply recall facts.

These newly emerging benchmarks differ greatly from familiar leaderboards like MMLU and GSM8K. Benchmarks are no longer asking "who gets the top score on a fixed set of questions?" but whether agents can navigate workflows, respect privacy appropriately, truly master specialized domains, and reason in environments where data of multiple modalities is given.

So although they may look like simple benchmarks on the surface, I think they are really new, aggressive claims about what capabilities AI should truly have. These claims, and these new benchmarks, will set the direction and framing of AI development going forward. And which benchmark you choose can matter as much as the system itself.

Here's hoping that the coming season brings even more interesting benchmarks and evaluation systems.

Twitter Library 🐦

Turing Post's Twitter Library is back after a long while!

This week, needless to say, everyone is excited about image generation models, and especially Google's Nano-Banana. So today we're introducing 10 powerful models worth trying if you need image generation, editing, or multi-turn refinement.

What the Turing Post Korea Team Is Reading

This piece builds on Alan Turing's 1950 paper "Computing Machinery and Intelligence" to explore the question "Can machines think?", introducing Turing's original approach via the "imitation game" (which later developed into the Turing test). Rather than clearly defining the difference between machines and humans, Turing sought to reveal the nature of thinking indirectly by evaluating whether a machine can behave like a human. This offers insight into how AI can move beyond being a mere calculating tool and expand the boundaries of intelligence. We think this discussion will offer plenty of inspiration for thinking about AI's present and future.

Since 2021, Madrona has tracked the progress of the AI field by publishing its Intelligent Applications list. This year it again presents 40 private tech companies building the future of software on AI, including startups driving innovative change in agent-based infrastructure, industry-specific AI applications, and consumer interfaces. Out of more than 340 candidates, 27 made the list for the first time, and only one, Databricks, has appeared every year since 2021.

New and Notable Research Papers

We introduce the 'notable latest AI models' first, and in each area the 'Top Pick' is marked with a star (🌟) in front of the paper!

Notable Latest AI Models

  • FastVLM: Efficient Vision Encoding for Vision Language Models

    FastVLM is an efficient vision language model (VLM) for understanding high-resolution images. It introduces a new hybrid vision encoder, FastViTHD, that reduces encoding latency and minimizes the number of visual tokens to achieve low latency with strong performance. Its standout claims are an 85x faster time-to-first-token (TTFT) than existing models and performance on par with LLaVA-OneVision with a 3.4x smaller vision encoder.

  • OLMoASR: A series of open speech recognition models
    A family of six fully open ASR models (39M to 1.5B parameters), trained on curated datasets of up to 680K hours. Benchmarked on 21 unseen test sets, OLMoASR-medium.en achieved 12.8%/11.0% WER on short-form/long-form audio, matching Whisper-medium.en, and the largest model narrowed the WER gap with Whisper-large to 0.4% when trained on the same data. Built from data filtered from a 3M-hour pool down to 1M hours, OLMoASR emphasizes reproducibility, rigorous data curation, and transparency.

  • gpt-realtime and Realtime API updates for production voice agents
    This speech-to-speech model outperforms its predecessor, reaching 82.8% accuracy on Big Bench Audio and 30.5% on MultiChallenge. It supports image input, SIP phone calling, and remote MCP servers; function-calling accuracy has improved to 66.5%; and two new voices, Marin and Cedar, add naturalness. Unlike traditional pipelines, it processes audio in a single step, which reduces latency.

  • InternVL3.5: Advancing open-source multimodal models in versatility, reasoning, and efficiency
    A family of LLM-based multimodal models that strengthens reasoning through Cascade Reinforcement Learning (offline + online RL), achieving a gain of +16.0% on tasks such as MMMU and MathVista. The Visual Resolution Router (ViR) dynamically adjusts visual token resolution, and Decoupled Vision-Language Deployment (DvD) balances GPU load. InternVL3.5-241B-A28B delivers 4.05x faster inference and achieves the best performance among open-source models on general multimodal and agentic tasks.

  • Hermes 4 technical report
    A family of hybrid reasoning LLMs built with 5M post-training samples (19B tokens), of which 3.5M are reasoning-focused examples up to 16K tokens long. Structured synthetic data was generated with DataForge, and rejection sampling was run in task-specific RL environments with Atropos. The 14B/70B/405B models scored 81.9% on AIME'24 and 61.3% on LiveCodeBench, surpassing DeepSeek-R1 while cutting overlong outputs by 78%. All weights and evaluation results are public.

  • rStar2-Agent: Agentic reasoning technical report
    This 14B-parameter math reasoning model was trained with agentic RL, using an RL strategy called GRPO-RoC to cope with noisy code environments. It was trained efficiently on just 64 MI300X GPUs, and in 510 RL steps reached 80.6% on AIME24 and 69.8% on AIME25, surpassing DeepSeek-R1 (671B). It is also reported to generalize well to alignment, scientific reasoning, and agentic tool-use tasks.
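
A note on the WER (word error rate) figures quoted for OLMoASR in the list above: WER is conventionally the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the model's output, divided by the number of reference words. The following is only a minimal sketch, assuming plain whitespace tokenization and no text normalization; published evaluations typically normalize casing and punctuation first, and the function name `word_error_rate` is our own, not something from OLMoASR or Whisper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are split on whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from output
            insertion = curr[j - 1] + 1   # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because the denominator is the reference length, WER can exceed 100% when the output contains many inserted words, which is worth keeping in mind when comparing short-form and long-form numbers.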

Efficiency and Acceleration

  • 🌟 Diffusion Language Models Know the Answer Before Decoding
    Introduces a method that speeds up inference in diffusion language models by detecting early convergence and committing tokens before full refinement. -> [View paper]

  • UltraMemV2: Memory Networks Scaling to 120B Parameters with Superior Long-Context Learning
    Redesigns the memory-layer architecture to reach efficiency comparable to Mixture of Experts (MoE) models, with better performance on long contexts and low memory-access cost. -> [View paper]

  • 🌟 Taming the Chaos: Coordinated Autoscaling for Heterogeneous and Disaggregated LLM Inference
    Introduces HeteroScale, a framework for optimizing large-scale LLM serving. It performs autoscaling while coordinating the prefill and decode stages across heterogeneous GPUs, improving GPU utilization by 26.6% and saving hundreds of thousands of GPU-hours per day. -> [View paper]

Reasoning Supervision and Control

  • 🌟 StepWiser: Stepwise Generative Judges for Wiser Reasoning
    Trains generative reward models to 'meta-reason' about intermediate steps, improving judgment accuracy and inference-time search performance. -> [View paper]

  • 🌟 ThinkDial: An Open Recipe for Controlling Reasoning Effort in Large Language Models
    Implements discrete reasoning modes at high, medium, and low levels to balance compute cost against performance. -> [View paper]

  • Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation?
    Analyzes the faithfulness of chain-of-thought (CoT) on soft-reasoning tasks, showing that influence and reliability do not necessarily coincide. -> [View paper]

Tool Use and Augmented Learning

  • 🌟 Provable Benefits of In-Tool Learning for Large Language Models
    Proves that tool-augmented models can extend factual recall beyond the limits of their parameters, and shows that they outperform in-weight memorization. -> [View paper]

  • 🌟 Understanding Tool-Integrated Reasoning
    Presents the first theoretical proof of the effectiveness of tool-augmented reasoning and proposes ASPO, a method for improving tool use. -> [View paper]

Code, Video, and Multimodal Systems

  • 🌟 Mixture of Contexts for Long Video Generation
    Introduces sparse attention routing into diffusion transformers to maintain consistency in long video generation. -> [View paper]

  • Self-Rewarding Vision-Language Model via Reasoning Decomposition
    Strengthens visual reasoning in vision-language models (VLMs) by separating perception from reasoning and rewarding self-contained perception. -> [View paper]

  • 🌟 Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
    Stabilizes text-to-image reinforcement learning through pairwise preference rewards and a unified benchmark. -> [View paper]

Agent Training

  • CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent
    Combines a generalist planner and a specialist executor via decoupled reinforcement learning in scientific-computing GUI environments. -> [View paper]

  • UItron: Foundational GUI agent with advanced perception and planning
    Trains a large-scale GUI agent for mobile and PC through SFT (supervised fine-tuning) over more than 1M steps and curriculum reinforcement learning, improving perception, grounding, and task planning on Chinese apps. -> [View paper]

Privacy, Safety, and Security of Agentic Systems

  • 🌟 Servant, Stalker, Predator: How An Honest, Helpful, And Harmless (3H) Agent Unlocks Adversarial Skills
    Exposes vulnerabilities in MCP (Model Context Protocol) agents, showing how benign tasks can be chained to bypass service isolation and threaten security. -> [View paper]
